Design and Enhancement of a Fog-Enabled Air Quality Monitoring and Prediction System: An Optimized Lightweight Deep Learning Model for a Smart Fog Environmental Gateway

Effective air quality monitoring and forecasting are essential for safeguarding public health, protecting the environment, and promoting sustainable development in smart cities. Conventional systems are cloud-based, incur high costs, lack accurate Deep Learning (DL) models for multi-step forecasting, and fail to optimize DL models for fog nodes. To address these challenges, this paper proposes a Fog-enabled Air Quality Monitoring and Prediction (FAQMP) system by integrating the Internet of Things (IoT), Fog Computing (FC), Low-Power Wide-Area Networks (LPWANs), and Deep Learning (DL) for improved accuracy and efficiency in monitoring and forecasting air quality levels. The three-layered FAQMP system includes a low-cost Air Quality Monitoring (AQM) node transmitting data via LoRa to the Fog Computing layer and then the cloud layer for complex processing. The Smart Fog Environmental Gateway (SFEG) in the FC layer introduces efficient Fog Intelligence by employing an optimized lightweight DL-based Sequence-to-Sequence (Seq2Seq) Gated Recurrent Unit (GRU) attention model, enabling real-time processing, accurate forecasting, and timely warnings of dangerous AQI levels while optimizing fog resource usage. Initially, the Seq2Seq GRU Attention model, validated for multi-step forecasting, outperformed the state-of-the-art DL methods with an average RMSE of 5.5576, MAE of 3.4975, MAPE of 19.1991%, R² of 0.6926, and Theil's U1 of 0.1325. This model is then made lightweight and optimized using post-training quantization (PTQ), specifically dynamic range quantization, which reduced the model size to less than a quarter of the original and improved execution time by 81.53% while maintaining forecast accuracy. This optimization enables efficient deployment on resource-constrained fog nodes like SFEG by balancing performance and computational efficiency, thereby enhancing the effectiveness of the FAQMP system through efficient Fog Intelligence.
The FAQMP system, supported by the EnviroWeb application, provides real-time AQI updates, forecasts, and alerts, aiding the government in proactively addressing pollution concerns, maintaining air quality standards, and fostering a healthier and more sustainable environment.


Introduction
Sensors 2024, 24, 5069
The escalating issue of air pollution poses a significant concern due to its widespread impact on human health and the environment, eliciting attention from industrialists, governments, academicians, and communities worldwide. According to the World Health Organization (WHO), over 92% of cities worldwide do not meet the established air quality guidelines. Air pollution is a major problem, particularly in developing nations like India, which ranks third in greenhouse gas emissions after China and the United States [1]. Despite government efforts, air quality levels have significantly worsened over the years due to various factors, including industrialization, urbanization, weather conditions, geographical features, and vehicular emissions. Air pollution is considered the most significant environmental health threat, where 9 out of 10 people breathe polluted air, causing seven million deaths globally every year [2,3]. It increases the risk of chronic respiratory diseases, impairs cognitive function, contributes to cardiovascular diseases and cancer, and increases susceptibility to viral infections like COVID-19 [4]. Elevated pollution levels seriously affect public health, the climate, the economy, and the ecosystem [5]. With 68% of India's population projected to live in urban areas by 2050, the current monitoring infrastructure is insufficient. The Central Pollution Control Board (CPCB) has announced that India plans to double the number of air quality monitoring stations to address these existing challenges. The comprehensive monitoring requirements and the pressing pollution threats necessitate the development of an efficient IoT-based architecture for air quality monitoring and forecasting systems using state-of-the-art technologies. This would enable policymakers to create strategies for pollution prevention and preemptive actions, thereby improving public health, environmental protection, urban planning, public awareness, economic benefits, and regulatory support in smart cities.
Experts estimate that 75 billion Internet of Things (IoT) devices will be connected to cyberspace by 2025 [6]. The advancements in the IoT and Artificial Intelligence (AI) have fueled the development of smart city applications, generating vast and diverse datasets. Efficiently analyzing these growing data in real time and making timely decisions is essential for improving urban life. While Cloud Computing offers substantial computational and storage capabilities for the data, it is limited by computational overhead, bandwidth constraints, and latency [7,8]. To overcome these limitations, Fog Computing (FC) was introduced. FC extends Cloud Computing by bringing computation, communication, storage, and networking closer to IoT data sources at the network's edge in a decentralized manner [9]. By enabling data processing at fog gateways or nodes positioned between the edge (terminal) and the cloud layer, rather than sending all data directly to the cloud [10], FC reduces reliance on cloud connections and minimizes potential data flow interruptions [8]. FC offers benefits like low latency, minimized bandwidth usage, real-time decision making and responses, contextual awareness, and scalability, effectively satisfying the Quality of Service (QoS) requirements for smart city applications [11].
To address the challenges in real-time applications, researchers have turned to advanced technologies such as IoT and Fog Computing. Despite the growing significance of FC, the majority of the existing Air Quality Monitoring (AQM) solutions are cloud-based, leading to high monitoring and communication costs. These solutions often lack real-time and accurate multi-step forecasting and timely decision making and fail to optimize Deep Learning (DL) models for efficient deployment on fog nodes. To address these challenges, this paper proposes a novel Fog-enabled Air Quality Monitoring and Prediction (FAQMP) system, featuring a low-cost AQM node with efficient communication and a Smart Fog Environmental Gateway (SFEG) that incorporates efficient Fog Intelligence to enhance real-time decision support in smart cities. The system effectively integrates state-of-the-art technologies, including Fog Computing (FC), IoT, LPWAN, and Deep Learning (DL). Specifically, Fog Intelligence enables the execution of DL models on fog nodes at the network's edge. However, the resource constraints of fog nodes pose challenges, leading to innovations in model optimization, energy efficiency, and real-time data processing [12]. With this concern, this paper highlights achieving efficient Fog Intelligence by optimizing DL models for fog nodes like SFEG through an optimized lightweight DL-based Sequence-to-Sequence (Seq2Seq) Gated Recurrent Unit (GRU) model. This model strikes a balance between model performance and computational efficiency, thereby enabling efficient utilization of SFEG resources.
The proposed FAQMP system features a hierarchical, three-layered architecture with the sensing layer, the FC layer, and the cloud (CC) layer. At the sensing layer, an AQM node is equipped with a customized PCB and low-cost sensors to acquire data on the major air pollutants that contribute to Air Quality Index (AQI) levels, including particulate matter 2.5 (PM2.5), particulate matter 10 (PM10), nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO), and ozone (O3), as well as meteorological data, including temperature, pressure, humidity, wind speed (WS), wind direction (WD), and solar radiation (SR). These data are transmitted to the FC layer via LoRa, a Low-Power Wide-Area Network (LPWAN) technology known for its long-range capabilities, cost-effectiveness, and low power consumption [13,14]. The FC layer utilizes Raspberry Pi 3 Model B+ [15] as the Smart Fog Environmental Gateway (SFEG) embedded with Fog Intelligence. The SFEG processes and manages data, provides real-time forecasting, and supports decision making. It includes an Early Warning System (EWS) that detects anomalies and triggers alerts, enabling preemptive actions and rapid responses. The processed data are then sent to the cloud using MQTT for long-term storage and complex analysis. Additionally, the EnviroWeb application provides users with real-time air quality data, forecasts, trends, and alerts, helping them make informed decisions and improve public health and safety.
The proposed system addresses the challenges of DL model deployment for real-time air quality forecasting through fog-cloud collaboration. This collaboration allows the DL model to be trained and optimized in the cloud and then transferred to the SFEG to facilitate Fog Intelligence. Initially, a DL-based Seq2Seq GRU Attention model is proposed, which outperforms the state-of-the-art DL baseline models by effectively capturing temporal dependencies in multi-step air quality forecasting. This initial DL model is further optimized through post-training quantization to be lightweight and efficient for deployment on resource-limited fog nodes like SFEG. This optimization enables efficient Fog Intelligence by reducing model size and computational latency while maintaining forecast accuracy.
To the best of our knowledge, no existing studies have developed an end-to-end air quality monitoring and forecasting system by incorporating efficient Fog Intelligence. The proposed FAQMP system effectively integrates Fog Computing with LPWAN, creating a decentralized architecture that reduces monitoring and communication costs. Specifically, it introduces efficient Fog Intelligence on the SFEG using an optimized lightweight DL model, enabling real-time processing, accurate multi-step forecasting, and timely early warnings and responses to anomalous events, which are crucial for decision support in smart city environments to improve public health and safety.
The main contributions of the work are summarized below:
1. Proposed a novel Fog-enabled Air Quality Monitoring and Prediction (FAQMP) system leveraging IoT with Fog Computing, LPWAN, and DL, aiming to support real-time and low-cost monitoring with accurate forecasting for decision support in smart cities.
2. Developed a Smart Fog Environmental Gateway (SFEG) that introduces efficient Fog Intelligence in the FC layer through fog-cloud collaboration.
3. Developed a user-friendly web application, namely EnviroWeb, to present real-time air quality data, AQI trends, forecasts, early warnings, and alerts to the users.
4. Proposed a DL-based Seq2Seq GRU Attention model for multivariate multi-step time series air quality forecasting. The model demonstrates superior performance and stability in forecasting air quality for multiple time steps in comparison with baseline models.
5. Developed an optimized lightweight DL model that facilitates efficient Fog Intelligence on the SFEG by striking a balance between computational efficiency and model performance.
The remainder of the paper is structured as follows: Section 2 provides an overview of the related works of air quality monitoring and forecasting systems and discusses the technologies enabling Fog Intelligence in IoT environments. Section 3 presents the proposed framework; the tools and technologies involved; the implementation of fog-cloud collaboration; the scalability aspects; and the real-world impacts. Section 4 describes the methodology for multivariate multi-step air quality forecasting. Section 5 presents the experimental evaluation and results of the proposed optimized lightweight DL model tailored for efficient Fog Intelligence. Finally, Section 6 discusses the concluding remarks and outlines future work.

Related Works
This section reviews related works on IoT-based air quality monitoring and forecasting systems and delves into methods and technologies that facilitate efficient Fog Intelligence in IoT environments. Each of the following subsections discusses the challenges, the research gaps, and the contributions of the proposed approach.

Air Quality Monitoring (AQM) Systems
The need for efficient AQM arises from the impact of air pollution on public health and the environment. Government agencies have established a quantitative tool called the Air Quality Index (AQI) [16,17] to express air quality levels and inform the public about the degree of pollution. Each country has its own standards to define the AQI. For example, in India, the AQI is calculated from six major pollutants: PM2.5, PM10, CO, NO2, SO2, and O3. The AQI converts the pollutant values into a single index value in the range 0-500 [14] classified under one of the six categories: good, moderate, sensitive, unhealthy, very unhealthy, and dangerous, as presented in Table 1. The higher the AQI levels, the greater the severity of pollution and its impact on human health. The individual AQI for the pollutants is denoted by I_1, I_2, ..., I_n and calculated using the linear interpolation method as in Equation (1).

\[
I_i = \frac{I_{high} - I_{low}}{C_{high} - C_{low}} \left( C - C_{low} \right) + I_{low}, \qquad i = 1, \ldots, n \tag{1}
\]

where
C_high: breakpoint concentration greater than or equal to the given concentration.
C_low: breakpoint concentration less than or equal to the given concentration.
I_high: AQI value corresponding to C_high.
I_low: AQI value corresponding to C_low.
C: concentration of the pollutant.
n: total number of pollutants.
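The linear interpolation behind the individual AQI values can be illustrated with a short Python sketch. It assumes a breakpoint table of (C_low, C_high, I_low, I_high) rows; the three rows in `EXAMPLE_BREAKPOINTS` are hypothetical placeholders chosen for readability, not values taken from Table 1 or any official standard. The function names are likewise illustrative, and `overall_aqi` assumes the common convention that the reported AQI is the largest sub-index, which this section does not state explicitly.

```python
from typing import List, Tuple

# One row per concentration band: (c_low, c_high, i_low, i_high).
# HYPOTHETICAL values for illustration only; real breakpoints must be
# taken from the applicable national AQI standard.
Breakpoint = Tuple[float, float, float, float]

EXAMPLE_BREAKPOINTS: List[Breakpoint] = [
    (0.0, 30.0, 0.0, 50.0),
    (30.0, 60.0, 50.0, 100.0),
    (60.0, 90.0, 100.0, 200.0),
]


def sub_index(c: float, breakpoints: List[Breakpoint]) -> float:
    """Interpolate the individual AQI of one pollutant linearly.

    I = (I_high - I_low) / (C_high - C_low) * (C - C_low) + I_low
    """
    for c_low, c_high, i_low, i_high in breakpoints:
        if c_low <= c <= c_high:
            return (i_high - i_low) / (c_high - c_low) * (c - c_low) + i_low
    raise ValueError(f"concentration {c} is outside the breakpoint table")


def overall_aqi(sub_indices: List[float]) -> float:
    """Assumption: the overall AQI is the worst (largest) sub-index."""
    return max(sub_indices)
```

With the placeholder table, a concentration of 45 falls in the second band, so `sub_index(45.0, EXAMPLE_BREAKPOINTS)` evaluates to approximately 75, halfway between the band's index limits of 50 and 100.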
Numerous studies have advanced IoT-based air quality monitoring (AQM) systems using a range of sensors, microcontrollers, and communication technologies. For instance, Dhingra et al. [18] developed the IoT Mobair kit with gas sensors, an Arduino Uno, and an ESP8266 Wi-Fi module to provide real-time air quality data and a map with pollution levels using Ubidots services. Similarly, Laskar et al. [19] created a framework that displays air quality data from sensors installed on roadways, recommending routes with the lowest AQI through an Android application for health-conscious users to make informed decisions. Alam et al. [20] introduced a low-cost system for measuring various environmental parameters but did not monitor all major pollutants responsible for AQI calculations or offer real-time alerts. Kelechi et al. [21] emphasized the affordability and accessibility of their low-cost AQM system for widespread deployment. While Kumar et al. [22] developed a low-cost, decentralized, and portable AQM system called AIRO that monitors air quality values, sends data to the AWS cloud, and forecasts AQI, it overlooks the meteorological parameters that likely influence air quality levels. Few studies have considered monitoring the meteorological parameters; for instance, Bobulski et al. [23] designed a monitoring system, highlighting the relationship between atmospheric pressure and particulate matter (PM10 and PM2.5). A recent work by Cabrera et al. [24] developed an affordable AQM system that monitored both meteorological parameters and pollutants. They stress that meteorological factors such as temperature, pressure, humidity, wind speed, and wind direction significantly influence the dilution and diffusion of air pollutants, subsequently affecting their distribution and concentration levels across seasons. Monitoring meteorological parameters in AQM is crucial for understanding the complex interactions between atmospheric conditions and air quality, as well as for grasping how seasonal changes impact air quality, which enables more accurate predictions and air pollution management. Similarly, Asha et al. [25] introduced the ETAPM-AIT model, which considered monitoring meteorological parameters and pollutants to provide real-time updates and alerts on hazardous conditions. Arroyo et al. [26] designed a portable system that tracks air quality, but it suffers from a limited communication range due to its reliance on short-range communication technologies, reducing its effectiveness for wide-area monitoring. A similar work by Barthwal et al. [27] used Bluetooth in the developed IoT sensing system to transmit air quality data to an Android app, which limits its coverage and effectiveness.
The aforementioned systems are cloud-based and lack real-time processing capabilities at the edge of the network, which is crucial for an immediate response to air quality changes. Furthermore, transmitting data directly to the cloud becomes expensive due to the associated costs of data transmission, processing, and storage. In addition, most of the AQM systems employ short-range communication technologies like Bluetooth, Zigbee, and Wi-Fi, leading to limited coverage range, higher power consumption, and increased susceptibility to interference in urban environments. Regarding the hardware platforms in AQM systems, many utilize a Raspberry Pi (RPi) [28–30] as the end device, which is expensive compared to microcontrollers like NodeMCU, Arduino Mega, and Arduino Uno, especially when deploying multiple nodes in the same network. IoT advancements have resulted in the development of low-cost sensors that are compact, lightweight, and energy-efficient. These improvements have facilitated the comprehensive deployment of AQM systems. Despite the popularity of low-cost sensors, there are reliability issues due to environmental sensitivity, instability, and manufacturing defects [31]. To address these challenges, sensors need to be calibrated to ensure accurate air quality measurements. For example, Koziel et al. [32] developed a simplified field calibration method for low-cost particulate matter (PM) sensors. The growing sensor market has driven advancements in calibration techniques to manage sensor drifts under harsh conditions and help improve the accuracy and reliability of sensor data, which are crucial aspects for effective AQM and analysis.
To overcome the limitations of cloud-based AQM systems discussed above, Fog Computing-based AQM solutions [33,34] with LPWAN [35] are gaining significant emphasis. FC brings storage, computation, and networking closer to the edge, providing faster processing, reduced latency, reliability, resilience to network failures, minimized bandwidth consumption, and reduced reliance on the cloud. For instance, Senthil Kumar et al. [36] proposed a fog-based intelligent air pollution monitoring system and analyzed the influence of meteorological parameters on AQI trends but did not integrate LPWAN for efficient communication with the fog node. In contrast, Santos et al. [37] emphasized the significance of integrating LPWAN in FC. Their fog-based system with anomaly detection, deployed on a Bpost delivery car in Antwerp, monitored PM1, PM2.5, and PM10 and promptly alerted citizens to elevated pollution levels. The study found that integrating FC improved response times compared to cloud-only solutions and underscored the importance of integrating LPWAN technologies in FC for efficient, real-time, and scalable AQM in smart cities.
LPWAN technologies overcome the limitations of short-range communication by supporting long-range transmission, low power consumption, decentralization, prolonged network battery life, and energy efficiency, making them well suited for resource-constrained fog environments. LoRa utilizes chirp spread spectrum modulation and facilitates robust communication over long distances while maintaining resistance to interference and noise, making it a suitable and cost-effective communication technology for AQM. Among LPWAN technologies, LoRa emerges as a promising solution [38] compared to Narrowband IoT (NB-IoT) [39] and Sigfox for smart city applications like AQM [40].
Despite significant advancements in AQM, several research gaps require attention. As summarized in Table 2, most existing AQM systems are cloud-based systems that capture only a few pollutant parameters that contribute to AQI, lack meteorological data, use short-range communication technologies, fail to adopt gateways as the fog node for intelligent processing at the edge, and ignore real-time air quality forecasting with timely early warnings for decision making in smart cities. Furthermore, these systems rarely integrate LPWAN and Fog Computing (FC) to meet real-time requirements. To address these gaps, our work effectively integrates Fog Computing, LPWAN, and Cloud Computing technologies to develop a low-cost and efficient FAQMP system. It addresses the limitations of short-range communication technologies by using LoRa, effectively captures the pollutants along with the meteorological parameters, and introduces a Smart Fog Environmental Gateway that provides intelligence for real-time forecasting and early warnings with event response to varying environmental situations. By leveraging the capabilities of Fog Computing and LoRa, the system will benefit from improved performance, real-time decision making, low latency, reduced costs, conservation of bandwidth, contextual awareness, and reliability. Additionally, the real-time EnviroWeb application presents live air quality data, AQI levels, historical trends, forecasts, and alerts to the stakeholders. Moreover, the proposed AQM system will serve as a foundation for accurate air quality forecasting.

Air Quality Forecasting Systems
The impact of air pollution on public health and the environment is substantial. Therefore, accurate multi-step air quality forecasting is crucial for decision making, early warnings, and strategies to address pollution concerns and maintain safe AQI levels in smart cities. Air quality forecasting involves estimating pollutant concentrations for future time steps based on historical air quality data. However, accurate multi-step forecasting is challenging due to the influence of environmental, meteorological, and various other conditions [41] on pollutant patterns. The common approaches for air quality forecasting include statistical and data-driven methods, such as Machine Learning (ML) and Deep Learning. Although traditional statistical approaches like Multiple Linear Regression (MLR), Moving Average (MA), Autoregressive (AR), Autoregressive Moving Average (ARMA), and Autoregressive Integrated Moving Average (ARIMA) [42,43] are suitable for short-term air quality forecasting, their intrinsic linearity assumptions limit their ability to solve nonlinear problems, leading to high errors and poor model stability. Meanwhile, AI gained importance due to its capability to handle non-linearity in time series data. The studies in [44–46] utilized AI-based ML techniques like Support Vector Regression (SVR), Gradient Boost Regressor (GBR), Adaboost, and Stacking Ensemble to enhance forecasting accuracy. Although ML approaches have shown promising results, they fail to capture complex patterns and long-term dependencies in time series data, which affects forecast accuracy.
Later, DL methods emerged as a promising solution to model and forecast time series data in academia and industry. With the ability to perform non-linear mapping, self-adapt, and model complex relationships between inputs and output variables, DL enhances the capability to effectively capture temporal trends and achieve robust prediction performance [47]. The commonly used DL methods for air quality forecasting are Recurrent Neural Networks (RNNs), LSTM, and GRU [48,49]. While RNN is effective for sequential analysis, it suffers from unstable (vanishing or exploding) gradients on long sequences, a problem that the gated memory cell of LSTM was designed to alleviate. LSTM thereby addresses long-term dependency challenges in capturing time series data [50–53]. For example, Athira et al. [54] compared RNN models, finding that GRU slightly outperformed RNN and LSTM in predicting PM10. Similarly, Lin et al. [55] explored five GRU-based predictive models, which exhibited superior performance over SVR, GBT, LSTM, and Aggregated LSTM (ALSTM) models. These GRU-based models demonstrated good performance in forecasting air quality across various scenarios, including seasonal variations, spatial distributions, and temporal patterns, and the authors also introduced an ensemble model called Multiple Linear Regression-based GRU (MLEGRU). In addition, Liu et al. [56] mentioned that shallow and DL predictors have evolved into hybrid approaches for enhanced forecasting performance. Hybrid forecasting models combine multiple DL models to handle time series air quality data that vary in time and space, and they inherit the advantages of the individual models. Several studies [57–59] have demonstrated that a hybrid RNN forecasting model yields better performance due to its capability to deal with complex and non-linear datasets, while also adapting to various forecasting requirements.
Furthermore, research on meteorological conditions influencing air quality levels primarily considered the correlation between air quality and meteorological conditions, often overlooking their influence in forecasting. To address this gap, our previous work [14] analyzed the influence of meteorological factors in forecasting and illustrated how incorporating these parameters enhanced accuracy. In addition, recently, researchers [60,61] also assessed the influence of meteorological conditions in forecasting air quality levels.
Most of the air quality forecasting studies primarily focused on single-step air quality forecasting methods. However, these methods only forecast air quality levels for the immediate time step and fail to offer insights into future steps. Consequently, multi-step forecasting methods have been introduced to effectively capture the behavior of air quality data over the coming hours and provide a more detailed and actionable view of future air quality conditions, enhancing both short-term responses and long-term strategies for pollution management. For example, Chang et al. [62] implemented an ALSTM model that outperformed GBTR, SVR, and LSTM to predict PM2.5 for the next 8 h. Du et al. [63] proposed a novel DAQFF model for multi-step forecasting of PM2.5 up to 6 h, utilizing one-dimensional CNNs and Bi-LSTM. Kow et al. [64] suggested a CNN-BPNN model for regional multistep-ahead forecasting that produced accurate forecasts compared to individual BPNN, RF, and LSTM models. Janarathan et al. [65] introduced an SVR-LSTM method for AQI prediction in a metropolitan city, outperforming the RNN, LSTM, and EMD-CNN models. Al-Janabi et al. [66] proposed a real-time intelligent forecasting system utilizing LSTM and PSO to predict primary pollutants for the next 48 h based on the information input from multiple stations in real time. Mokhtari et al. [67] developed a multipoint DL model based on Conv-LSTM to predict dynamic air quality levels, focusing on sudden pollution events. Furthermore, Hu et al. [68] proposed a Conv1D-LSTM and conducted extensive experiments to analyze the hourly PM2.5 and PM10 forecasts with single- and multi-step prediction for the next 6 h.
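The difference between single-step and multi-step forecasting is easiest to see in how training samples are cut from a multivariate series. The sketch below is a generic sliding-window helper, not the preprocessing used in any of the cited studies; the name `make_windows` and its parameters (`n_in` past steps, `n_out` future steps, `target_col`) are illustrative assumptions. Setting `n_out=1` gives the single-step case, while larger values give multi-step targets.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[List[float]], List[float]]


def make_windows(
    series: Sequence[Sequence[float]],
    n_in: int,
    n_out: int,
    target_col: int = 0,
) -> List[Sample]:
    """Cut a multivariate series into (input window, future targets) pairs.

    series     : rows are time steps, columns are features
                 (for example pollutants and meteorological variables).
    n_in       : number of past time steps given to the model.
    n_out      : number of future time steps to forecast.
    target_col : column whose future values form the target sequence.
    """
    samples: List[Sample] = []
    last_start = len(series) - n_in - n_out
    for start in range(last_start + 1):
        # Input: all features over the n_in most recent steps.
        x = [list(row) for row in series[start:start + n_in]]
        # Output: the target feature over the next n_out steps.
        y = [series[t][target_col]
             for t in range(start + n_in, start + n_in + n_out)]
        samples.append((x, y))
    return samples
```

A series of T time steps yields T - n_in - n_out + 1 samples, or none at all when the series is shorter than one full window.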
However, the DL methods discussed for multi-step forecasting have limitations in extracting the long-term sequential characteristics of air quality data and forecasted only a few primary pollutants responsible for AQI. Therefore, state-of-the-art models with the Seq2Seq network, built on the encoder-decoder structure, address the issue by effectively handling temporal correlations in input and output sequences, leading to significant enhancements in multi-step forecasting tasks. Also, there are only limited studies on modeling temporal data with the encoder-decoder architecture for multi-step air quality forecasting. The encoder-decoder architecture employs a deep neural network (such as RNN or LSTM) to encode the input data into a fixed-length vector and utilizes another deep neural network to decode this vector into the target output sequence, generating predictions. For instance, Feng et al. [69] introduced an enhanced autoencoder (EnAutoformer) to improve air quality forecasting performance. Zhang et al. [70] developed an encoder-decoder model using an enhanced LSTM, known as read-first LSTM (RLSTM), as the encoder and LSTM as the decoder.
Despite the widespread adoption of encoder-decoders in multi-step forecasting, a limitation arises due to the potential loss of temporal information as the length of the input sequence increases during compression into a context vector with fixed-length encoding. To tackle this challenge, attention mechanisms are integrated into encoder-decoder models [71,72], enabling the model to focus on specific segments of the input sequence using attention weights and thus mitigating the risk of information loss. This mechanism assigns relative importance to each hidden state in the encoder, allowing the decoder to select relevant information from the input sequence to generate the output. It plays a crucial role in effectively extracting essential features and capturing long-range dependencies in long time series data, resulting in enhanced performance for multi-step forecasting. For instance, Dairi et al. [47] proposed a Variational Autoencoder (VAE) with multiple directed attention that outperformed RNN model variants in forecasting CO, NO2, O3, and SO2. Chen et al. [73] proposed the STA-LSTM, which incorporates an extreme value attention mechanism to mitigate the influence of sudden changes in air quality prediction. Li et al. [74] proposed an attention-based CNN-LSTM model that integrates an additional attention mechanism layer to capture spatiotemporal characteristics in the input data, demonstrating improved performance over the standard CNN-LSTM architecture. Du et al. [75] proposed a novel temporal attention encoder-decoder model based on Bi-LSTM to adaptively learn long-term dependencies and hidden correlation features for multi-step forecasting. Jia et al. [76] implemented a Seq2Seq framework with an attention mechanism to predict O3 for the next 6 h.
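How attention weights assign relative importance to encoder hidden states can be sketched in a few lines of plain Python. The example assumes dot-product scoring between the current decoder state and each encoder state, which is only one of several common scoring functions; the cited works and the proposed model may use a different one (for example an additive score with learned parameters). The names `softmax` and `attend` are illustrative.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    decoder_state: List[float],
    encoder_states: List[List[float]],
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, context vector) for one decoding step.

    Assumption: the score of each encoder hidden state is its dot
    product with the decoder state.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: weighted sum of the encoder hidden states.
    context = [sum(w * state[k] for w, state in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context
```

Because the context vector is recomputed at every decoding step from all encoder states, the decoder is no longer limited to a single fixed-length summary of the input sequence.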
However, there are limited works that effectively explore the encoder-decoder structure with an attention mechanism in temporal modeling of air quality for multi-step forecasting. These works mostly fail to forecast all six primary pollutants contributing to the AQI and ignore the influence of meteorological parameters. A notable research gap is that the existing studies have overlooked the implementation of multi-step forecasting on fog nodes to facilitate real-time decision support, reduced latency, and high accuracy. With these considerations, we propose a Seq2Seq GRU Attention model for accurate multi-step forecasting of the primary pollutants in determining the AQI for future time steps. Furthermore, the model is optimized using a model compression technique, as discussed in the next section, for efficient deployment on the SFEG.

Technical Enablers for Fog Intelligence in IoT Environments: Model Compression Techniques and Hardware Exploration
In the forthcoming years, IoT-based smart city applications are expected to increasingly integrate AI services into their core operations, with the training/inference ratio of DNN models projected to increase from 1:1 to 1:5 [77]. Consequently, significant advancements are anticipated through the adoption of AI-powered intelligence in Edge [78] and Fog Computing. Maccantelli et al. [79] suggest that embedded AI has proven effective in various IoT applications and is well suited for the monitoring of air quality in smart cities.
The Fog Intelligence paradigm brings together AI and Fog Computing to execute DL models partially or entirely on fog nodes located at the network's edge. This paradigm is particularly useful in IoT environments like air quality monitoring, where data are generated and need to be processed in real time for timely decisions, reducing latency and improving efficiency. Typically, DL model deployment in the IoT is categorized into three main groups: (1) DL training and inference on the cloud; (2) DL training and inference on the fog; and (3) DL training on the cloud and inference on the fog. Group 1, utilizing the centralized cloud for both training and inference, introduces challenges like high latency, elevated communication costs, network congestion, and privacy concerns. Group 2 faces challenges in DL training due to the resource constraints of fog nodes in terms of memory, processing, and power. However, Group 3 distributes tasks by training DL models in the cloud and deploying them in the fog [80].
To enable Group 3 deployment, the fog-cloud collaboration allows CC and FC layers to work together for model training, inference, and data sharing and to enable Fog Intelligence.Fog Intelligence reduces latency, improves scalability and bandwidth savings, and provides faster decision making for real-time IoT applications.Despite the benefits it offers, a significant challenge lies in achieving efficient Fog Intelligence through model compaction, creating optimized lightweight DL models suitable for resource-constrained fog nodes with limited hardware and memory.This requires optimizing model size, forecasting accuracy, and execution time of the DL models.Key enablers for achieving this include model compression and appropriate hardware selection.

Model Compression
Efficient Fog Intelligence involves deploying an optimized lightweight DL model on fog nodes. To achieve this, model compression methods such as quantization, pruning, and knowledge distillation can effectively reduce the size and complexity of DL models, thereby decreasing the memory footprint and computational requirements and reducing execution time while preserving model performance.
Quantization is a conversion technique that reduces the number of bits used to represent weights and activations by employing a smaller data type. This involves converting numerical values from a higher-precision format, like 32-bit floating point, to a lower-precision format, such as 8-bit integers, to represent layer inputs, weights, or both. Quantization techniques are classified into quantization-aware training (QAT) and post-training quantization (PTQ). PTQ includes dynamic range quantization, integer-only quantization, integer with float fallback quantization, and float16 quantization, as depicted in Figure 1 [81].
Quantization offers three main advantages: reduced memory usage, reduced complexity of mathematical operations leading to faster inference, and improved energy efficiency [82]. Frameworks like NVIDIA TensorRT, TensorFlow Lite, and OpenVINO support quantization for efficient execution on edge and fog devices. For instance, TensorFlow Lite is a lighter version of the TensorFlow framework that allows quantization to an 8-bit integer (i.e., −127 to +127) for resource-constrained devices, enhancing power efficiency and reducing memory usage.
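As a rough illustration of what 8-bit quantization does to a weight tensor, the following sketch applies symmetric per-tensor quantization to a short list of weights using only the Python standard library. The weight values, the helper names (`quantize_int8`, `dequantize`), and the single shared scale are assumptions chosen for illustration; this is not the TensorFlow Lite implementation, which selects scales and handles activations in its own way.

```python
from array import array


def quantize_int8(weights):
    """Map float weights to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric 8-bit range.
    codes = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


weights = [0.82, -1.27, 0.05, 0.0, 0.64, -0.33]  # hypothetical weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Storage cost: 4 bytes per float32 weight versus 1 byte per int8 code.
float32_bytes = array("f", weights).itemsize * len(weights)
int8_bytes = codes.itemsize * len(codes)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(list(codes), float32_bytes, int8_bytes, max_error)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half a quantization step per weight; whether that error is acceptable for a given model has to be checked empirically.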
Pruning [83] involves removing the least significant parameters, such as weights and biases, to reduce the complexity and memory footprint of a neural network. On the other hand, knowledge distillation entails the transfer of knowledge from a larger, more complex model (the teacher) to a simpler, smaller model (the student) by training the student to mimic the teacher's output. This minimizes computation cost and memory usage, making the student model suitable for efficient inference on resource-constrained devices.
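A minimal sketch of pruning is given below. It assumes that "least significant" is judged by absolute magnitude, which is one common criterion but not necessarily the one used in [83]; the function name `magnitude_prune`, the sparsity level, and the weight values are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


weights = [0.9, -0.02, 0.4, 0.01, -0.7, 0.05]  # hypothetical weights
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)
```

The zeroed entries only save memory or computation when the model is subsequently stored or executed in a sparse-aware form.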
In summary, model compression methods [84] transfer the capabilities found in large ensembles into smaller, more concise predictive models to achieve efficient Fog Intelligence. Due to the iterative nature of training, most compression techniques are applied post-training, as in PTQ. Advancements in Fog Intelligence call for further research into these methods and their effective use in fog deployment.

Exploring Hardware for Fog Intelligence
Efficient Fog Intelligence requires deploying optimized lightweight DL models on suitable fog hardware. Recent advancements have enabled DL training and inference on resource-constrained devices, pushing intelligence to edge and fog nodes. These advancements include new hardware designs and software frameworks that enable Edge and Fog Intelligence. The hardware supporting a range of AI algorithms includes CPUs (Central Processing Units), GPUs (Graphics Processing Units), DSPs (Digital Signal Processors), FPGAs (Field-Programmable Gate Arrays), and specialized AI accelerators optimized for the low latency, power efficiency, and computational needs of AI at the edge [85]. Devices like the Raspberry Pi and NVIDIA TX2 are considered potential edge and fog devices [86]. Table 3 provides a summary of a few potential edge and fog devices that enable the deployment of DL models, along with their specifications, showing that fog nodes are heterogeneous, particularly in terms of hardware capabilities.
The hardware specifications in Table 3 will help to select devices for proof-of-concept work related to edge and fog computing use cases. There are several AI accelerators and AI processors designed for edge and fog environments. Examples include the NVIDIA Jetson Series, Google Coral Edge TPU, and Intel Movidius Neural Compute Stick [87].
In addition, open-source AI frameworks are quite successful in developing, training, validating, and evaluating DL models, leveraging the acceleration offered by edge and fog hardware. Popular DL frameworks like TFLite, Caffe, and PyTorch have revolutionized Edge and Fog Intelligence. They operate best on GPUs and other specialized hardware devices to accelerate model training by reducing memory and computation requirements and model complexity, without compromising the model's accuracy. However, it is quite challenging to select an appropriate framework while developing fog computing systems due to the varying performance and heterogeneity of fog hardware. The combination of the AI accelerator's hardware efficiency and the flexibility of open-source frameworks enables developers to leverage the strengths of both aspects for efficient deployment of DL models on fog devices. From the discussion, it is inferred that the efficient deployment of DL models on fog devices requires model compression methods and suitable hardware choices, which serve as key enablers for efficient Fog Intelligence. This emerging research area presents numerous challenges and opportunities. Developing an efficient Fog Intelligence system involves DL model compression techniques, appropriate fog device hardware, and collaboration between software and hardware components. Therefore, hardware-aware software-level optimization offers promising research directions [9].
Moreover, there is limited research on developing optimized lightweight DL-based air quality forecasting models specifically tailored for deployment on resource-constrained fog nodes that can effectively handle computational and memory constraints for efficient Fog Intelligence. Our work addresses this gap by developing an optimized lightweight DL model using model compression (the PTQ technique) that is suitable for fog nodes. For the proof of concept, the Raspberry Pi 3 Model B+ is chosen as the fog node and enabled as a Smart Fog Environmental Gateway (SFEG). The Raspberry Pi 3 Model B+, a configurable single-board computer (SBC), is chosen over the other fog devices presented in Table 3 due to its cost-effectiveness, processing and networking capabilities, compactness, connectivity, and low power consumption. The proposed optimized lightweight DL model strikes a balance between high forecast accuracy, significant file size reduction, and improved execution time when deployed on the SFEG. Moreover, the efficient Fog Intelligence introduced using this model enables real-time processing of air quality data at the edge, accurate multi-step forecasting, optimized utilization of fog resources, and decision support through timely early warnings and event responses in real time, ensuring optimal AQI levels in smart cities.

Proposed Approach: A Fog-Based IoT Architecture and Fog-Cloud Collaboration in the FAQMP System
This section discusses the Fog Computing architecture, the functionalities of the SFEG, and the implementation of fog-cloud collaboration.

Architecture of the FAQMP System and Hardware Implementation
The proposed FAQMP system adopts a three-layered architecture: the sensing layer at the bottom, the FC layer in between, and the CC layer at the top, as depicted in Figure 2. Each layer has distinct functionalities that collaborate to achieve an efficient end-to-end IoT system, managing data processing and analysis across different layers. The architecture is explained in detail as follows:

•
Sensing layer: The sensing layer serves as the foundation for monitoring air quality levels. The primary component of this layer is the AQM sensor node, designed with a PCB configured to function as an end device, as shown in Figure 3a. The PCB modularly integrates an array of dedicated low-cost sensors that acquire pollutant and meteorological parameters, along with a LoRa communication module connected to the controller unit. The controller unit is an Arduino Mega 2560, which is a low-cost, low-power, and resource-constrained microcontroller. Low-cost sensors have gained importance in facilitating dense deployments, greater coverage, and portability over traditional static monitoring systems. The sensors, as discussed in Table 4, are selected based on their cost, precision, accuracy, range, ability to monitor gases, lifetime, and compatibility with the controller. In particular, the sensors, including SDS011, MICS4514, MQ131, MQ136, BME280, pyranometer, and MPXV7002DP, measure the values of PM2.5, PM10, NO2, SO2, CO, O3, temperature, pressure, humidity, SR, WS, and WD. Due to variations between sensors in production, it is recommended to calibrate them before deployment [27,31] to ensure accuracy in the measured values. Thus, the sensors in our AQM system are calibrated before data acquisition. For instance, Algorithm 1 presents the steps to pre-calibrate sensors like the MQ136, where the coefficients x and y are extrapolated based on the characteristic curve presented in the datasheet [88]. The pre-calibration ensures that the sensor provides accurate and reliable readings of the measured gas.
Algorithm 1: Pre-Calibration of the MQ136
1: R0 calculation (R0: sensor resistance in pure air);
2: Rs calculation (Rs: sensor resistance in the presence of a specific gas);
3: Analog read sensor pin;

The AQM sensor node is statically placed 5 m above the ground, in accordance with the CPCB guidelines. It samples the air quality data every minute. After the heterogeneous air quality and meteorological data are acquired, they are unified and appended with the sensor ID and timestamp to form a payload. Each payload comprises twelve 32-bit floating-point numbers (PM2.5, PM10, NO2, SO2, CO, O3, temperature, pressure, humidity, SR, WS, and WD), equal to 48 bytes, and a 16-bit integer (authentication number), equal to 2 bytes. The total payload to be transmitted from the sensor node to the FC layer is therefore 50 bytes, or 400 bits. This payload is periodically transmitted to the FC layer through LoRa. The Dorji DRF1276DM LoRa communication module [89], built around a Semtech SX1276 LoRa chip operating in the 433 MHz ISM band, is integrated with the controller unit to transmit data to the FC layer. The process by which the end device sends data to the upper-level device in the FC layer is referred to as uplinking, whereas downlinking refers to transmitting data from the fog layer to the end devices. To ensure reliable data transmission from the AQM node to the FC layer, the adopted radio parameters for the LoRa link are a transmission power of +14 dBm, a spreading factor (SF) of SF9, a coding rate (CR) of 4/5, and a bandwidth (BW) of 250 kHz. The transmission power of +14 dBm provides a good trade-off between range and power consumption, ensuring reliable data transmission to the FC layer without the need for excessive power, thereby conserving battery life. SF9 maintains a balance between range and data rate by effectively handling a 50-byte payload. A CR of 4/5 offers robust error correction to ensure data accuracy despite interference. A BW of 250 kHz supports a higher data rate and facilitates quicker and more efficient transmission of the payload while maintaining a good range. The adopted LoRa configuration thus strikes a balance among power consumption, data rate, and range.
The designed AQM system is a low-cost, low-power, multi-sensing, configurable, easily installable, and accurate system with long-range wireless data transmission capabilities to monitor air quality levels. Figure 2a shows the hardware of the proposed AQM sensor node, which costs approximately USD 120 and is considerably less expensive than traditional AQM systems for large-scale monitoring deployments.

•
Fog Computing (FC) Layer: FC is vital for air quality monitoring and forecasting due to its ability to offer computation and storage closer to the data sources, with the benefits of minimized response time, optimized bandwidth efficiency, reliability, and reduced burden on the cloud. FC addresses the challenges of data processing, analysis, and transmission in dynamic environmental conditions. For the proof of concept, the proposed system utilizes a cost-effective Raspberry Pi 3 Model B+ as the fog gateway or fog node, as shown in Figure 2b. The fog gateway receives air quality sensor data from the sensing layer using the Dorji DRF1276DM LoRa module. The received data are filtered and processed for further analysis. Fog Intelligence is introduced by deploying an optimized DL model for on-device inference, enabling efficient processing by eliminating the need to constantly communicate with the cloud. In addition, the Early Warning System (EWS) detects anomalies and initiates an event response upon detecting dangerous AQI levels. The real-time services offered by the RPi for air quality data storage, management, communication, data analysis, early warnings, Fog Intelligence, and fog-cloud collaboration enable it to act as a Smart Fog Environmental Gateway (SFEG). The services of the SFEG to manage the data and resources are detailed in Section 3.2. Furthermore, the forecast results and the air quality data are sent to the cloud using MQTT for historical storage and analysis.

•
Cloud Computing (CC) Layer: The cloud layer at the top of the hierarchy centralizes and manages the data obtained from the SFEG in the FC layer. It offers a robust infrastructure to store and process historical air quality data, manage fog nodes, train complex DL models, and serve end-user applications. We chose the AWS platform as it offers a comprehensive suite of secure services like AWS IoT Core, AWS Lambda, DynamoDB, S3, CloudWatch, and SageMaker, making it an ideal choice for our system requirements. Furthermore, a web application, namely EnviroWeb, is designed to present stakeholders with real-time air quality data, pollutant trends, AQI levels, forecasts, and early warnings using the data stored in the cloud.
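The pre-calibration idea described for the sensing layer can be sketched as follows. The sketch assumes a power-law fit of the form ppm = x · (Rs/R0)^y to the datasheet characteristic curve, a voltage-divider readout, and a 10-bit ADC. The supply voltage, load resistance, ADC readings, and the coefficient values passed for x and y are hypothetical placeholders, not values taken from the MQ136 datasheet or from the deployed node.

```python
ADC_MAX = 1023        # full-scale reading of a 10-bit ADC (assumption)
V_SUPPLY = 5.0        # divider supply voltage in volts (assumption)
R_LOAD_KOHM = 10.0    # load resistor in kilo-ohms (hypothetical)


def sensor_resistance(adc_value):
    """Sensor resistance Rs in kilo-ohms from a raw ADC reading."""
    if not 0 < adc_value < ADC_MAX:
        raise ValueError("ADC reading must lie strictly between 0 and ADC_MAX")
    v_out = adc_value * V_SUPPLY / ADC_MAX
    return R_LOAD_KOHM * (V_SUPPLY - v_out) / v_out


def calibrate_r0(clean_air_readings):
    """Estimate R0 as the mean sensor resistance measured in pure air."""
    resistances = [sensor_resistance(r) for r in clean_air_readings]
    return sum(resistances) / len(resistances)


def gas_ppm(adc_value, r0, x, y):
    """Concentration estimate from the assumed power-law curve x * (Rs/R0)**y."""
    return x * (sensor_resistance(adc_value) / r0) ** y


r0 = calibrate_r0([200, 200, 200])         # hypothetical clean-air readings
print(gas_ppm(400, r0, x=40.0, y=-1.1))    # hypothetical coefficients
```

With a negative exponent y, a lower sensor resistance (higher ADC reading) maps to a higher estimated concentration, which matches the usual downward-sloping characteristic curve of this sensor family.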
Table 5 presents a summary of the tools, technologies, and methods utilized in the three layers of the FAQMP system.
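The 50-byte payload and the LoRa radio settings described for the sensing layer can be checked with a short sketch. The byte order (little-endian), the field order, the field names, and the sample readings are assumptions made for illustration; the bit-rate expression SF · (BW / 2^SF) · CR is the nominal LoRa modem rate and ignores preamble, header, and any framing added by the radio module.

```python
import struct

# Assumed layout: twelve little-endian float32 readings, then a uint16
# authentication number, with no padding.
PAYLOAD_FORMAT = "<12fH"
FIELDS = ("pm25", "pm10", "no2", "so2", "co", "o3",
          "temperature", "pressure", "humidity", "sr", "ws", "wd")


def pack_payload(readings, auth_number):
    """Serialize the twelve readings and the authentication number."""
    values = [readings[name] for name in FIELDS]
    return struct.pack(PAYLOAD_FORMAT, *values, auth_number)


def unpack_payload(payload):
    """Inverse of pack_payload: returns (readings dict, auth_number)."""
    *values, auth_number = struct.unpack(PAYLOAD_FORMAT, payload)
    return dict(zip(FIELDS, values)), auth_number


def lora_bit_rate(sf, bw_hz, cr):
    """Nominal LoRa bit rate in bit/s: SF * (BW / 2**SF) * CR."""
    return sf * (bw_hz / 2 ** sf) * cr


# Hypothetical readings, chosen to be exactly representable in float32.
sample = dict(zip(FIELDS, (35.5, 60.25, 18.0, 4.5, 0.75, 22.0,
                           29.5, 1008.0, 64.0, 410.0, 2.5, 180.0)))
payload = pack_payload(sample, auth_number=4660)
bit_rate = lora_bit_rate(sf=9, bw_hz=250_000, cr=4 / 5)

print(len(payload), len(payload) * 8, bit_rate)
```

Under these assumptions, the payload bits alone occupy roughly 0.11 s of air time at SF9, 250 kHz, and CR 4/5, which is small compared with the one-minute sampling interval.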

Implementation of Fog-Cloud Collaboration in the Proposed FAQMP System
A significant distinction between the proposed FAQMP system and the traditional system lies in the incorporation of fog-cloud collaboration. Deploying a DL model via a cloud-based solution encounters communication latency, whereas edge- and fog-based solutions face challenges due to resource constraints. To address their standalone limitations, the fog and cloud platforms collaborate actively for model training and forecasting through fog-cloud collaboration. Fog-cloud collaboration entails integration and cooperation between the FC and CC layers for the joint execution of DL tasks. As discussed earlier in Section 2.3, the deployment of a DL model for inference is categorized into three types based on its execution at different layers. Of these, the proposed system employs model training in the cloud and inferencing in the fog layer, fostering Fog Intelligence via fog-cloud collaboration. This approach leverages the computational power and resources of the cloud alongside the real-time capabilities of FC to achieve fast and responsive inference. The realization of fog-cloud collaboration that enables Fog Intelligence is achieved through various functionalities of the SFEG and cloud, as illustrated in Figure 4 and detailed below.
The cloud publisher transmits the data to the cloud over Wi-Fi using MQTT. MQTT [5] is a lightweight publish/subscribe messaging protocol suitable for resource-constrained devices to support low-bandwidth, high-latency, and unreliable network environments. The SFEG publishes air quality data in JSON format under the topic "sfeg/air_quality_data" to the AWS IoT Core broker, which processes the messages and stores the data in DynamoDB via an AWS Lambda function. This setup enables seamless integration of the developed EnviroWeb application with DynamoDB for accessing air quality data.

•
Early Warning System (EWS) Handler: The EWS Handler analyzes the live data and forecasted data to detect anomalies and provide insights into the potential issues with the air quality data, including seasonal deviations. If any anomaly exists in the live data based on the AQI threshold defined, it is referred to as a live data anomaly. On the other hand, if an anomaly exists in the forecasted values, it is referred to as a prediction anomaly. Similarly, if an anomaly exists in both live and forecasted data, it is referred to as a discrepancy anomaly. While processing, if an anomaly is detected, the event variable is updated with the anomaly type (e.g., live data anomaly), and then the sub-module of the EWS, namely the event response module, is activated as presented in Figure 4.
• Event Response: Based on the nature of the event, the SFEG's event response module makes timely decisions and initiates appropriate actions to address the anomaly. These actions include information streaming, actuator control, and sensor network configuration.
(a) Information Streaming: Timely alerts or notifications are sent to users via channels such as email, SMS, and EnviroWeb dashboards. This is crucial for providing immediate information about any detected anomalies, particularly those related to dangerous air quality levels. For testing purposes, we simulated a fire event in which the AQI levels increased in the live data. This simulated spike activated the event response module, triggering an immediate reaction. As a result, an email alert was generated and received, as shown in Figure 6, demonstrating how the system works in real time to notify stakeholders about hazardous air quality conditions.
(b) Sensor Network Configuration: During a live data anomaly, commands are sent to the controller of the AQM sensor node via LoRa downlinking. These commands adjust the AQM sensor node's sampling rate to capture more detailed event information.
(c) Actuator Control: During a live data anomaly, commands are sent to activate alarms, adjust ventilation systems, activate air purifiers, and control pollutant emission sources to maintain a healthy environment. This is crucial during time-critical hazardous events, such as dangerous air quality levels, fires, and gas leakages, to enable a timely response that promptly reduces the severity of dangerous situations.
Sensors 2024, 24, x FOR PEER REVIEW
The event response helps to avoid exposure to abnormal pollutant levels, mitigate air pollution-related health risks, and address environmental concerns effectively.

•
Fog manager: The fog manager is mainly responsible for managing all the modules of the SFEG. It includes tasks such as resource allocation, data processing, data transmission, communication management, and overall coordination of activities within the FC environment.
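The anomaly classification performed by the EWS Handler and the hand-off to the event response module can be sketched as below. The threshold value, the function names, and the mapping from anomaly type to actions are assumptions for illustration; in particular, treating a discrepancy anomaly like a live data anomaly for actuation purposes is our reading of the description above, not a stated rule.

```python
AQI_THRESHOLD = 200  # hypothetical AQI level treated as dangerous


def classify_anomaly(live_aqi, forecast_aqi, threshold=AQI_THRESHOLD):
    """Return the anomaly type for one live value and a list of forecast values."""
    live_high = live_aqi > threshold
    forecast_high = any(value > threshold for value in forecast_aqi)
    if live_high and forecast_high:
        return "discrepancy anomaly"
    if live_high:
        return "live data anomaly"
    if forecast_high:
        return "prediction anomaly"
    return None


# Assumed mapping: alerts are sent for every anomaly, while downlink and
# actuator commands are issued only when the live data exceed the threshold.
RESPONSES = {
    "live data anomaly": ("information streaming",
                          "sensor network configuration",
                          "actuator control"),
    "prediction anomaly": ("information streaming",),
    "discrepancy anomaly": ("information streaming",
                            "sensor network configuration",
                            "actuator control"),
}


def event_response(event):
    """Actions to trigger for the given event (empty when there is no anomaly)."""
    return RESPONSES.get(event, ())


event = classify_anomaly(live_aqi=310, forecast_aqi=[150, 160, 170])
print(event, event_response(event))
```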
In summary, the implementation of fog-cloud collaboration allows efficient utilization of data, knowledge, and resources across the fog and cloud layers, resulting in proactive decision making and improved responsiveness to address the requirements of real-time IoT-based air quality application services.
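The cloud publisher step described above, in which the SFEG publishes JSON messages under the topic "sfeg/air_quality_data", can be sketched as follows. Only the topic name comes from the text; the message fields, the function names, and the sample values are hypothetical. The `client_publish` argument stands for any callable that accepts a topic and a payload, for example the `publish` method of an already connected MQTT client, so the sketch makes no assumption about a particular client library or about the AWS IoT Core connection setup.

```python
import json
from datetime import datetime, timezone

TOPIC = "sfeg/air_quality_data"


def build_message(sensor_id, readings, forecast_aqi, timestamp=None):
    """Serialize one observation and its forecast as a JSON string."""
    ts = timestamp or datetime.now(timezone.utc)
    message = {
        "sensor_id": sensor_id,          # hypothetical field names
        "timestamp": ts.isoformat(),
        "readings": readings,
        "forecast_aqi": forecast_aqi,
    }
    return json.dumps(message, sort_keys=True)


def publish_observation(client_publish, message):
    """Hand the message to the MQTT client under the SFEG topic."""
    client_publish(TOPIC, message)


# Stand-in for a real MQTT client: collect what would be sent.
sent = []
message = build_message(
    sensor_id="aqm-01",
    readings={"pm25": 35.5, "pm10": 60.25},
    forecast_aqi=[92, 95, 101],
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
publish_observation(lambda topic, payload: sent.append((topic, payload)), message)
print(sent[0][0], sent[0][1])
```

Keeping the serialization separate from the transport makes it straightforward to test the message format on the gateway without a broker connection.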

EnviroWeb Application
EnviroWeb, a user-friendly web application, provides users with real-time air quality data and forecasting insights. By combining air quality monitoring and prediction, EnviroWeb helps individuals and communities make informed decisions to protect their health and well-being and take proactive steps toward improved air quality levels. The main features of the EnviroWeb application are presented below:

•
Real-time Air Quality Data: The application offers real-time measurements of air pollution and meteorological data, as well as the Air Quality Index (AQI), as shown in Figure 7.

•
Historical Data Analysis and Visualization: The application features customizable visualizations in charts and graphs to illustrate the historical pollution and AQI trends over a user-specified period (day, week, or month).

•
Air Quality Forecast and Recommendations: The application presents AQI predictions for the next 3 h at an interval of 15 min. Based on the live and forecasted air quality data, decisions are made and recommendations are presented to the users related to health, travel, lifestyle changes, behavioral adjustments, and actions to reduce pollution levels.

•
Smart Alerts: When an anomalous event or hotspot is detected based on the live and forecasted air quality levels, alerts are presented in the dashboard as a part of the EWS Handler's actuation (information streaming), as discussed previously in Section 3.2.

•
Maps: The interactive map uses color-coded markers based on the AQI level of a specific location, allowing the users to navigate and explore the surrounding AQM station locations and determine AQI hotspots.

•
Data Export and Sharing: The data export and sharing feature allows users to export the monitoring and forecast data.
The front-end application module is implemented through the Angular framework, a widely used JavaScript platform for building responsive web applications. The backend uses AWS DynamoDB APIs to ensure seamless interaction with DynamoDB, allowing for efficient data management and retrieval of air quality data.

City-Wide Air Quality Management with FAQMP: Achieving Scalability and Real-Time Insights
Scaling the FAQMP system involves addressing the complexities and demands of monitoring and forecasting air quality on a larger, city-wide scale. This includes deploying thousands of AQM sensor nodes across multiple city locations, which creates a comprehensive network that delivers detailed and representative air quality data. This expanded network results in more accurate and reliable insights. This section explores the challenges and solutions associated with scaling the FAQMP system and highlights its real-world impact.

Challenges and Bottlenecks in Scaling the Proposed FAQMP System
The challenges and potential bottlenecks in scaling the FAQMP system to a city-wide scale are outlined below:

•
Data Volume and Management: Increased data from a growing number of sensor nodes will demand robust processing and storage solutions at fog nodes, such as the SFEGs.

•
Resource Management: As the number of AQM sensor nodes increases, managing computational resources efficiently becomes crucial, requiring careful resource allocation.

•
Communication Efficiency: Deployment of thousands of sensors leads to higher network traffic. Therefore, ensuring efficient communication among the sensor nodes, the SFEGs, and the cloud layer is essential for seamless data exchange and low-latency processing.

•
Resource Constraints: Fog nodes like the SFEG typically have limited CPU, memory, and storage resources compared to cloud servers, which constrains their ability to process large volumes of sensor data and run complex algorithms.

•
Real-time Processing: Handling high volumes of air quality data and complex forecasting models while maintaining minimal latency and timely responses is challenging.

•
Model Adaptation and Performance: DL models must adapt to varying air quality patterns and environmental conditions through continuous retraining, which can be resource-intensive.

•
Maintenance and Management: Managing and calibrating numerous sensors to ensure accurate air quality data are complex tasks.

•
Cost Management: Scaling involves higher costs for hardware, installation, maintenance, and ongoing operations.

•
Data Security: Ensuring the security and privacy of sensitive information is crucial.
While scaling the proposed FAQMP system, addressing the discussed issues is crucial to ensure effectiveness and efficiency in monitoring and forecasting air quality levels.

City-Wide Implementation of the FAQMP System: Addressing the Scalability Challenges
Building on the challenges identified in the previous section, the following discussion presents strategies to address these complexities, ensuring the successful deployment and operation of thousands of sensors in a scaled environment.
Let us consider scaling the proposed FAQMP system as in Figure 8 by deploying AQM sensor nodes across various city locations, such as residential areas, city parks, and industrial zones, with each node collecting air pollution and meteorological data. The decentralized, hierarchical Fog Computing architecture uses distributed fog nodes, i.e., SFEGs in the fog layer, to manage and process data from the AQM nodes. Each AQM node connects to a local SFEG in the intermediary FC layer and transmits data via LoRa, supporting efficient communication over long distances. Moreover, setting up LoRa in a mesh network will support dynamic allocation of resources and routing of tasks based on current network conditions, offering significant benefits, including enhanced network resilience, improved data routing, and increased coverage. Each SFEG aggregates, preprocesses, and filters the data collected from the connected AQM sensor nodes, minimizing the volume of data transmitted to the central cloud. Local storage at the SFEGs temporarily retains sensor data, reducing data loss. These strategies help mitigate network congestion, optimize bandwidth, and reduce cloud storage requirements.
To maintain efficiency and responsiveness, Fog Intelligence employs lightweight DL models optimized for SFEGs, providing real-time air quality forecasts with minimal response time and improved performance. This approach allows scaling without overburdening the limited resources of the SFEG. Furthermore, to keep the system adaptive to changing conditions, the DL models are periodically retrained. Since doing this directly on the SFEGs would be too resource-intensive, the system uses the proposed fog-cloud collaboration strategy, in which model retraining occurs in the cloud and real-time inference runs on the SFEG. This collaboration manages resources effectively, minimizes latency, and ensures efficient model updates. For instance, if a fire is detected at any location in the city, the anomaly detection in the Early Warning System module alerts users through the EnviroWeb application, enabling timely responses and actions to critical events. The sensor network configuration captures detailed event information, and automated controls manage alarms and extinguishers. This automation scales operational capabilities without manual intervention, ensuring effective emergency responses. In addition, automated calibration adjusts sensors for drift and environmental changes. Moreover, cost-efficient hardware and technologies, as used in the proposed system, help to manage the scaling expenses.
The above discussion clarifies how the FAQMP system can address scalability challenges such as managing large data volumes, optimizing resource allocation, maintaining efficient communication, ensuring real-time processing, model adaptation, operational efficiency, and cost management, enabling it to effectively and efficiently monitor and forecast air quality levels across extensive urban areas.

The Role of the FAQMP System in Shaping Public Health Policies and Urban Development
The proposed FAQMP system tackles air quality challenges through a multi-faceted approach and can significantly shape public health policies and urban development in the following ways:

•
Public Health Protection: The system provides real-time AQI monitoring and forecasts via EnviroWeb, offering health advisories to vulnerable populations to reduce pollutant exposure. The recommendations will allow citizens to make informed decisions about outdoor activities and travel for an enhanced quality of life.

•
Timely Warnings and Responses: The EWS module of the FAQMP system detects anomalous pollution events, triggers alerts via email and EnviroWeb, and captures detailed information through adaptive sensor sampling. Identifying pollution sources enables targeted interventions and regulations to decrease emissions.

•
Proactive Measures: Multi-step forecasting predicts air quality trends for the coming hours, allowing urban planners to implement preemptive strategies in real time and manage pollution peaks.

•
Dynamic Policy Adaptation: The FAQMP system enables the formulation and adjustment of policies related to environmental regulations, emission standards, industrial regulations, and urban design based on the real-time AQI data.

•
Enhanced Urban Planning and Resource Allocation: The FAQMP system will help urban planners identify pollution hotspots based on AQI levels, enabling optimized resource allocation for pollution control and urban infrastructure improvements, such as green spaces and buffer zones.

•
Traffic Management: The data from the FAQMP system will support optimizing traffic flow and congestion management strategies, leading to reduced vehicular emissions.

•
Community Engagement and Sustainable Development: The system promotes transparency and public awareness through EnviroWeb, fostering community involvement and sustainable development.

•
Economic Benefits: The FAQMP system reduces healthcare costs and environmental damage through efficient monitoring and targeted interventions.
In summary, this section detailed the system architecture and the hardware and software implementation of the FAQMP system, discussed its scalability, and highlighted its impact on urban planning.

Methodology
In this research, a GRU-based Seq2Seq architecture with an attention mechanism is investigated for multivariate multi-step forecasting of air quality levels over future time steps.

Gated Recurrent Unit (GRU)
The Gated Recurrent Unit (GRU) [91] is an advanced variant of an RNN, designed to address vanishing and exploding gradient problems by means of its gating mechanism. It effectively learns the long-term dependencies in time series data and has a simpler internal structure and shorter training time than LSTM. The GRU architecture, as shown in Figure 9, has an update gate, which regulates the retention and integration of past and new information, and a reset gate, which regulates how much of the previous state is to be forgotten. This design supports handling both short-term and long-term dependencies efficiently.
The equations of the update gate ($z_t$) and the reset gate ($r_t$) in the GRU are presented in Equations (2)-(6):
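For reference, one widely used formulation of the GRU is sketched below. The notation is an assumption and may differ from the symbols and numbering of Equations (2)-(6): $W$, $U$, and $b$ denote input weights, recurrent weights, and biases; $\sigma$ is the logistic sigmoid; $\tilde{h}_t$ is the candidate state; and $\odot$ is element-wise multiplication. Some references swap the roles of $z_t$ and $1 - z_t$ in the last line.

```latex
\begin{align*}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t
\end{align*}
```

In this convention, a value of $z_t$ close to 1 retains the previous state, while a value of $r_t$ close to 0 lets the candidate state ignore it.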

Sequence-to-Sequence (Seq2Seq) GRU Attention Model
A Seq2Seq model is an encoder-decoder structure designed to map an input sequence to a target sequence. However, a traditional encoder-decoder model suffers from temporal information loss, especially with longer sequences, which can degrade performance. To overcome this limitation, the proposed Seq2Seq GRU Attention model incorporates an attention mechanism. This attention mechanism selectively emphasizes relevant parts of the input sequence, allowing the model [92] to focus on crucial information and manage dependencies across long sequences effectively. The temporal attention layer is positioned as the interface between the encoder and the decoder. The architecture of the Seq2Seq model with attention mechanism is shown in Figure 10.
Let us consider the Seq2Seq GRU Attention model as a scholar working on a complex research article with the help of a comprehensive textbook. The encoder is analogous to the scholar reading and summarizing the textbook into a concise study guide that captures the essential information and key points. The attention mechanism is like a highlighter that the scholar uses to mark the most critical sections of the study guide, emphasizing areas that are essential for comprehending the subject matter. Finally, the decoder is like the scholar writing the final article, using the highlighted study material to produce a well-informed research paper.
In air quality forecasting, the input data can be compared to the comprehensive textbook: the encoder uses GRU units to convert the input air quality data sequence into a compact, high-level summary known as the context vector. This summary captures essential patterns and trends from the historical data. The attention mechanism dynamically assigns different weights to various parts of the input sequence based on their significance in forecasting air quality levels, highlighting the most relevant parts of the summary for each forecasting step. The decoder employs GRU units to generate multi-step future predictions based on this highlighted information; i.e., it leverages the weighted context from the attention mechanism to make accurate predictions, considering the most relevant historical data for each forecast step. This process enables the model to convert extensive historical air quality data into actionable forecasts, much as a scholar turns extensive study material into a well-organized and insightful research paper.
The components of the Seq2Seq GRU Attention model are discussed in detail below.

1.
Encoder: The encoder is a GRU that processes the given input sequence $X = [x_0, x_1, \ldots, x_T]$ to generate a sequence of hidden states $[h_1, h_2, \ldots, h_T]$, where $T$ is the length of the input sequence. At each encoding time step $t$, the hidden state $h_t$ is updated by using both the input vector $x_t$ and the previous hidden state $h_{t-1}$, as illustrated in Equation (7):
$$h_t = \mathrm{GRU}(x_t, h_{t-1}) \tag{7}$$

2.
Attention Mechanism: The attention mechanism in the Seq2Seq GRU Attention model allows the decoder to focus on significant parts of the input sequence while generating the output, guided by attention scores.
• Attention score calculation: The "attention score" or "alignment score" for each encoder hidden state $h_i$ is calculated using a scoring function. The attention score $e_{t,i}$ indicates how much importance the decoder's previous state places on the specific encoder state $h_i$. The attention score is calculated as illustrated in Equation (8):
$$e_{t,i} = s_{t-1}^{\top} h_i \tag{8}$$
where $e_{t,i}$ is the attention score, based on the dot product of the two vectors, that signifies the correlation between the decoder's previous hidden state $s_{t-1}$ and the $i$th hidden state of the encoder $h_i$ at time step $t$, and $\top$ refers to the transpose operation.
• Attention weight calculation: Attention weights are the normalized version of the attention scores. After computing the attention scores $e_{t,i}$, a softmax function is applied to these scores to obtain the temporal attention weights $\alpha_{t,i}$, as displayed in Equation (9):
$$\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_{k=1}^{T} \exp(e_{t,k})} \tag{9}$$
where $T$ denotes the length of the input sequence.

•
Context vector calculation: The context vector is a fixed-size representation of the input sequence, calculated by combining the encoder's hidden states with the attention weights as illustrated in Equation (10):
$$c_t = \sum_{i=1}^{T} \alpha_{t,i} h_i \tag{10}$$
It represents a focused summary of the input sequence, with different elements weighted according to their relevance to the current decoding step $t$. Higher $\alpha_{t,i}$ values indicate an increased significance of the associated hidden state $h_i$. This enables the attention mechanism to assign higher attention weights to elements in the input sequence that are more relevant to generating output at the current time step. The weighted summation in the context vector calculation allows the model to focus on different parts of the sequence dynamically and to use relevant information from the encoder's hidden states during decoding.

3.
Decoder: The decoder is another GRU that reads the information from the context vector and its internal states to generate the output sequence. The context vector $c_t$ obtained from the attention mechanism is combined with the decoder's previous hidden state ($s_{t-1}$) and previous target output ($y_{t-1}$), then fed to the GRU unit to compute the current hidden state $s_t$, as in Equation (11):
$$s_t = f(s_{t-1}, y_{t-1}, c_t) \tag{11}$$
where $f$ is the GRU function, and $s_t$ acts as the starting point for computing the output sequence.
• The output layer is a regression function that outputs the predicted value $y_t$. The decoder generates the output at each time step $t$ based on the current hidden state $s_t$, the previous output $y_{t-1}$, and the context vector $c_t$, as expressed in Equation (12):
$$y_t = W[c_t, s_t, y_{t-1}] + b \tag{12}$$
where $W$ is the weight matrix, $b$ is a bias vector, and $[c_t, s_t, y_{t-1}]$ represents the concatenation of the context vector, the current hidden state, and the previous output.
Similarly, the steps involved in the attention mechanism (step 2) and the decoder (step 3) are repeated until the maximum sequence length is reached.
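The attention computations described above (dot-product score, softmax weights, weighted-sum context vector) can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names, the array shapes, and the random example data are assumptions, and the encoder and decoder states are assumed to have the same dimension so that the dot product is defined.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()

def attention_step(encoder_states, prev_decoder_state):
    """One attention step.

    encoder_states: array of shape (T, d), one hidden state h_i per row.
    prev_decoder_state: array of shape (d,), the decoder state s_{t-1}.
    """
    scores = encoder_states @ prev_decoder_state  # e_{t,i} = s_{t-1}^T h_i
    weights = softmax(scores)                     # alpha_{t,i}, sums to 1
    context = weights @ encoder_states            # c_t = sum_i alpha_{t,i} h_i
    return scores, weights, context

# Hypothetical example: T = 8 encoder steps, hidden size d = 4.
rng = np.random.default_rng(0)
H = rng.normal(size=(8, 4))
s_prev = rng.normal(size=4)
scores, weights, context = attention_step(H, s_prev)
```

The encoder state with the largest score receives the largest weight, so it contributes most to the context vector passed to the decoder at that step.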

Experimental Evaluation
We carried out experiments to create an optimized lightweight model for air quality forecasting, aiming to enable efficient deployment of Deep Learning (DL) models for Fog Intelligence on the Smart Fog Environmental Gateway (SFEG).
Experiment I involved determining an accurate multivariate multi-step DL-based air quality forecasting model by comparing state-of-the-art methods. Subsequently, the selected model is made lightweight by conversion to a TFLite model and optimized by applying the PTQ technique based on model compression, as discussed in Section 2.3.1. Experiment II focused on validating the performance of the optimized lightweight model. This involved analyzing the effects of different quantization methods on the initial model to ensure efficient deployment on fog nodes by reducing model size and lowering execution time while maintaining high model accuracy.

Experiment I: DL-Based Multivariate Multi-Step Forecasting
Multivariate multi-step air quality forecasting refers to a prediction modeling task that forecasts values of pollutant variables for the next $h$ time steps based on multiple input variables (air quality and meteorological variables), where $h \in \mathbb{N}^{*}$ denotes the forecasting horizon. If $h = 1$, the task reduces to single-step forecasting.
This section demonstrates the effectiveness of the proposed Seq2Seq GRU Attention model for multivariate multi-step air quality forecasting by comparing its performance with baselines using historical air quality data. Figure 11 shows the stages involved in carrying out Experiment I.

Dataset Description
This study utilized historical air quality data obtained from a government-established CPCB AQM station located in SIDCO Kurichi, Coimbatore, Tamil Nadu, India (latitude: 22.544697, longitude: 88.342606) [93]. The dataset spans one year, from June 2019 to June 2020, with 35,232 samples recorded at 15 min intervals, and is publicly accessible on the CPCB website. The investigated variables included the primary air pollutants and meteorological parameters: PM2.5, PM10, NO2, SO2, CO, O3, humidity, WS, WD, and SR. The concentrations of PM2.5, PM10, SO2, NO2, and O3 are expressed in micrograms per cubic meter (µg/m3), while CO is measured in milligrams per cubic meter (mg/m3). The EDA indicates that the pollutant levels increase during the winter months of December, January, and February, in contrast to the monsoon months of June and July. This is because the meteorological conditions greatly influence the air quality levels and play a crucial role in enhancing the forecasting performance [14].

Data Preprocessing
Data preprocessing is an important step in air quality modeling as it enhances the representation of collected data and improves model performance. It involves handling the missing values, addressing the outliers, normalizing the data, and performing a dataset split. The air quality data from AQM stations may contain missing values and inconsistent measurements due to sensor malfunction, power failure, or the influence of external and uncontrollable factors. Missing values increase the uncertainty of the data, making it difficult to capture the temporal characteristics effectively. As part of preprocessing, missing values are identified and imputed using the linear interpolation method, as expressed in Equation (13), to ensure temporal continuity between the data points and improve the model's effectiveness. We then identified the outliers using the IQR and replaced them using linear interpolation.

$$y^{*} = y_1 + \frac{(t^{*} - t_1)(y_2 - y_1)}{t_2 - t_1} \tag{13}$$
where $y^{*}$ is the missing value at time $t^{*}$, and $(t_1, y_1)$ and $(t_2, y_2)$ are the data points of two known samples. The points $y_1$, $y_2$, and $y^{*}$ lie on a straight line, and $t^{*}$ may lie inside or outside the time interval $[t_1, t_2]$. In addition, the data are normalized to the range [0, 1] using min-max scaling, a linear transformation expressed in Equation (14):
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{14}$$
where $x^{*}$ represents the normalized feature value, $x$ is the original value, and $x_{\max}$ and $x_{\min}$ represent the maximum and minimum values of the feature, respectively. The normalization eliminates the influence of measurement scale differences among the features in the dataset and improves model convergence. The data are transformed into time series samples, and a train-test split is performed on the dataset, with the training set comprising 80% of the data and the testing set 20%. The training set is employed for model fitting, and the testing set is used to evaluate the performance of the model.
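The preprocessing steps described above can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the 1.5 × IQR fence is the conventional default (the text only says "IQR"), the split is assumed to be chronological, and the scaling bounds are taken from the training portion only.

```python
import numpy as np

def interpolate_missing(y):
    """Fill NaNs by linear interpolation between the nearest known samples."""
    y = np.asarray(y, dtype=float).copy()
    t = np.arange(len(y))
    known = ~np.isnan(y)
    # np.interp holds the edge value constant for gaps at the series ends.
    y[~known] = np.interp(t[~known], t[known], y[known])
    return y

def replace_iqr_outliers(y, k=1.5):
    """Mark values outside [Q1 - k*IQR, Q3 + k*IQR] and re-interpolate them."""
    y = np.asarray(y, dtype=float).copy()
    q1, q3 = np.percentile(y, [25, 75])
    iqr = q3 - q1
    outlier = (y < q1 - k * iqr) | (y > q3 + k * iqr)
    y[outlier] = np.nan
    return interpolate_missing(y)

def min_max_scale(x, x_min, x_max):
    """Linear rescaling; values equal to x_min map to 0 and x_max to 1."""
    return (x - x_min) / (x_max - x_min)

def chronological_split(data, train_fraction=0.8):
    """Keep the first 80% of samples for training and the rest for testing."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

# Hypothetical single-pollutant series with one gap and one spike.
series = np.array([10.0, 12.0, np.nan, 16.0, 18.0, 500.0, 22.0, 24.0, 26.0, 28.0])
filled = interpolate_missing(series)
cleaned = replace_iqr_outliers(filled)
train, test = chronological_split(cleaned)
scaled_train = min_max_scale(train, train.min(), train.max())
```

Taking the scaling bounds from the training portion avoids leaking information from the test period into model fitting; whether the original study did so is not stated.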
A multivariate time series of air quality data comprising air pollutant and meteorological variables recorded over consecutive time intervals is represented as $X = \{x_{i,j}\}$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m)$, where $i$ is the time index, $n$ is the time series length, $j$ is the feature index, and $m$ is the number of variables. The vector $X_i$ refers to the values of the air pollutants and meteorological variables at the $i$th time step. The representation can be illustrated as a two-dimensional matrix, as shown in Equation (15):
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{15}$$
During the training of a multivariate multi-step forecasting model, historical time series samples are utilized. Each sample consists of observations of all the features over a fixed window size, serving as the input. The corresponding output comprises the sequence of target variables for the future time steps. In the multi-output strategy, a single function $F$ maps the inputs to the outputs, as in Equation (16), during training.
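The construction of input and output pairs for the multi-output strategy can be illustrated with a sliding window. The sketch below is illustrative only: the function name `make_windows`, the window length of 24 steps, and the synthetic data are assumptions, while the horizon of 12 steps, the 10 input variables, and the 6 target pollutants follow the setup described in this paper.

```python
import numpy as np

def make_windows(data, window, horizon, target_cols):
    """Slice a (n, m) series into supervised samples.

    Returns X of shape (samples, window, m) holding all features over the
    input window, and Y of shape (samples, horizon, len(target_cols))
    holding the target variables for the following `horizon` steps.
    """
    data = np.asarray(data, dtype=float)
    n_samples = len(data) - window - horizon + 1
    X = np.stack([data[i:i + window] for i in range(n_samples)])
    Y = np.stack([
        data[i + window:i + window + horizon][:, target_cols]
        for i in range(n_samples)
    ])
    return X, Y

# Synthetic stand-in for the dataset: 100 time steps, 10 variables.
series = np.arange(100 * 10, dtype=float).reshape(100, 10)
X, Y = make_windows(series, window=24, horizon=12, target_cols=[0, 1, 2, 3, 4, 5])
```

Each row of `X` is paired with the twelve future steps that immediately follow it, so a single model call produces the whole forecast horizon at once.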

Experimental Settings and Baselines
The experiments are conducted on an Apple MacBook Pro equipped with an M1 chip featuring an 8-core CPU and 8 GB of unified memory.The DL models are implemented in Python 3.7 using the Keras framework built on TensorFlow.
The state-of-the-art models, including GRU, Seq2Seq-GRU, GRU Autoencoder, GRU Attention, Seq2Seq LSTM Attention, Seq2Seq BiLSTM Attention, GRU-LSTM Autoencoder, LSTM-GRU, and LSTM-GRU Attention, are considered baselines for comparison against the proposed model. This section presents the experimental results analyzing the performance of the Seq2Seq GRU Attention model and the baseline models in forecasting the six primary pollutants responsible for the AQI (PM2.5, PM10, NO2, SO2, CO, and O3) over multiple future time steps. During testing, we fed the model with pollutant and meteorological data as the input and forecast pollutant concentrations for the next three hours, i.e., twelve consecutive time steps at 15 min intervals. All models are tested on the same test set, and the forecasting performance is assessed in terms of RMSE, MAE, R2, MAPE, IA, and Theil's U1.
Although we analyzed the forecasting performance for all six pollutants, owing to space constraints we present only the forecasting performance for the two most significant pollutants, PM2.5 and PM10, in Tables 7 and 8, at different stages of multi-step forecasting (1st time step to 12th time step). Analyzing the impact of the forecasting step on performance for PM2.5 and PM10 in Tables 7 and 8, and for the other pollutants, reveals a common pattern: as the forecast time step increases, the RMSE, MAE, MAPE, and Theil's U1 gradually increase, while R2 decreases for all the models. This indicates that multi-step prediction is significantly more challenging than single-step forecasting, primarily because accuracy degrades with the forecasting horizon owing to the uncertainty in time series air quality data arising from weather patterns, environmental conditions, and other factors that affect air quality. Therefore, it is crucial to consider the forecasting horizon when evaluating forecasting models to ensure their effectiveness in providing reliable predictions over different time steps. Nevertheless, the proposed model exhibits good performance in multi-step forecasting, even as the time step increases. The results of the proposed model are highlighted in bold.
Analyzing the forecasting performance for PM2.5 in Table 7, the proposed Seq2Seq GRU Attention model achieved the best performance for long-term forecasting (t + 12), with the lowest RMSE (7.9083), MAE (5.4929), MAPE (27.4658), and U1 (0.177) and the highest R2 (0.5309) among all models. In contrast, the GRU has the highest error rates for both single-step-ahead (t + 1) and long-term (t + 12) forecasting when compared with the other models, which are hybrid RNNs and encoder-decoder-based models. Similarly, for PM10 in Table 8, the proposed Seq2Seq GRU Attention model achieved the lowest RMSE (11.7805), MAE (8.3638), MAPE (29.323), and U1 (0.1925) and the highest R2 (0.6212) compared to the baselines for long-term forecasting (t + 12). This is mainly because the attention mechanism in the Seq2Seq GRU architecture improves the long-term forecasting performance for the pollutants. In addition, the line graphs in Figures 12-15 illustrate the performance of the compared models in forecasting PM2.5 and PM10 concentrations over future time steps (t + 1 to t + 12). They also illustrate the ability of the proposed Seq2Seq GRU Attention model to maintain the lowest error and outperform the baselines, demonstrating its effectiveness and stability in multi-step forecasting.
In addition, Table 9 presents the performance metrics (RMSE, MAE, MAPE, R2, and U1) of the various models for each pollutant (PM2.5, PM10, NO2, SO2, CO, and O3), with values representing the average errors across twelve time steps (t + 1 to t + 12) for each model and pollutant. The last column displays the average performance across all pollutants, providing a comprehensive assessment of each model's performance. Figure 16 presents a bar chart of the performance metrics (RMSE, MAE, MAPE, R2, and Theil's U1) of the various models across all pollutants over 12 time steps, showing that the Seq2Seq GRU Attention model attains better average forecasting performance than the baseline models.
To summarize the extensive experimental analysis, the results show that the Seq2Seq GRU Attention model outperforms the baseline models in multivariate multi-step air quality forecasting. The attention mechanism effectively captures the relationship between current and past time sequences to improve stability in long-term predictions of all the primary pollutants (PM2.5, PM10, NO2, SO2, CO, and O3) responsible for determining AQI levels.
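The evaluation metrics used in Experiment I can be computed as in the sketch below. The definitions follow the standard textbook forms (with Theil's U1 normalized by the root mean squares of the observed and predicted series, and IA taken as Willmott's index of agreement); they are assumed to match the definitions used in this study, and the sample values are purely illustrative.

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (%), R2, Theil's U1, and IA for two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mean_true = y_true.mean()

    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    # MAPE is undefined when an observed value is zero.
    mape = 100.0 * np.mean(np.abs(err / y_true))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - mean_true) ** 2)
    u1 = rmse / (np.sqrt(np.mean(y_true ** 2)) + np.sqrt(np.mean(y_pred ** 2)))
    ia = 1.0 - np.sum(err ** 2) / np.sum(
        (np.abs(y_pred - mean_true) + np.abs(y_true - mean_true)) ** 2
    )
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2, "U1": u1, "IA": ia}

# Illustrative values only, not taken from the experiments.
metrics = forecast_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

Lower RMSE, MAE, MAPE, and U1 indicate better forecasts, while R2 and IA closer to 1 indicate better agreement with the observations.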

Experiment II: Evaluation of an Optimized Lightweight DL Model for Efficient Fog Intelligence
After determining an accurate AQ forecasting model (the initial model) in Experiment I, the study developed an optimized lightweight variant to facilitate efficient model deployment for Fog Intelligence on the resource-constrained fog node (SFEG). To achieve this, we converted the initial DL model (the Seq2Seq GRU Attention model), built using TensorFlow (TF), into a TensorFlow Lite (TFLite) model. During this conversion, TFLite supports quantization techniques for model compression. This section evaluates and compares, on the fog node (i.e., the SFEG), the TFLite models produced by different post-training quantization techniques: dynamic range quantization, integer with float fallback quantization, full-integer-only quantization, and float16 quantization. This analysis identifies the post-training technique that yields an optimized lightweight Seq2Seq GRU Attention model, reducing the model's size and execution time while maintaining high forecast accuracy on the SFEG.
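The four post-training quantization variants compared in this experiment map onto TFLite converter settings roughly as in the sketch below. It is an illustration under stated assumptions rather than the study's conversion script: the helper name `convert_to_tflite`, the mode labels in `QUANT_MODES`, and the `representative_data` argument (an iterable of float32 input windows used for calibration) are hypothetical, and recurrent models may need additional converter options depending on the TensorFlow version.

```python
QUANT_MODES = ("none", "dynamic_range", "float16", "int_float_fallback", "full_integer")


def convert_to_tflite(keras_model, mode="dynamic_range", representative_data=None):
    """Convert a trained Keras model to TFLite flatbuffer bytes with one PTQ mode."""
    if mode not in QUANT_MODES:
        raise ValueError(f"unknown quantization mode: {mode!r}")
    needs_calibration = mode in ("int_float_fallback", "full_integer")
    if needs_calibration and representative_data is None:
        raise ValueError("integer quantization needs representative_data for calibration")

    import tensorflow as tf  # imported lazily so the argument checks work without TF

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    if mode != "none":
        # With no further settings this enables dynamic range quantization.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    if mode == "float16":
        converter.target_spec.supported_types = [tf.float16]
    if needs_calibration:
        # The converter expects a callable returning a generator of input lists.
        converter.representative_dataset = lambda: ([x] for x in representative_data)
    if mode == "full_integer":
        # Restrict to int8 kernels and make the model's input/output int8 as well.
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.int8
        converter.inference_output_type = tf.int8
    return converter.convert()
```

The returned bytes can be written to a `.tflite` file, whose size on disk is the quantity compared in the file size analysis.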
The Raspberry Pi 3 Model B+, serving as the SFEG, is used to evaluate the performance of the TFLite models. It features a BCM2837 quad-core processor, 1 GB of RAM, built-in Wi-Fi, and Bluetooth 4.1 with BLE. Its support for TFLite, a lightweight version of
• TFLite interpreter: The TFLite interpreter loads the TFLite model (optimized model), prepares it for execution, and performs on-device inference on the input data, enabling efficient execution of TFLite models.
Together, the TFLite converter and interpreter enable efficient deployment of TF-based DL models on fog devices with a reduced memory footprint and inference time.
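On the device side, loading a `.tflite` file and timing inference over a set of input windows might look like the following sketch. The helper names `load_tflite_runner` and `timed_forecasts` are hypothetical, and the code assumes a model with a single input and a single output tensor; it is not the study's measurement script.

```python
import time


def load_tflite_runner(model_path):
    """Return a function that runs one inference with the given .tflite file."""
    import tensorflow as tf  # imported lazily; only needed when a model is loaded

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]

    def run(window):
        # window must match the input tensor's shape and dtype.
        interpreter.set_tensor(input_index, window)
        interpreter.invoke()
        return interpreter.get_tensor(output_index)

    return run


def timed_forecasts(run, windows):
    """Run `run` on every window and return (outputs, elapsed seconds)."""
    start = time.perf_counter()
    outputs = [run(w) for w in windows]
    return outputs, time.perf_counter() - start
```

Calling `timed_forecasts(load_tflite_runner("model.tflite"), test_windows)` (both names are placeholders) would give a wall-clock figure comparable in spirit to the execution times reported for the test data.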
We evaluated the resulting file size, execution time, and model accuracy of the TensorFlow Lite model with and without quantization. Table 11 compares the file size of the original TF model and the unoptimized TFLite model on the SFEG. The original model measures 1176 KB, while the TFLite version is considerably smaller at 397 KB, roughly one third of the original size. File size reduction is a key step for resource-constrained devices with minimal storage. Further size reduction is achieved through post-training quantization. With the TFLite model's size of 397 KB as a reference, dynamic range quantization achieves a reduction of 70%, full-integer quantization achieves about a 68% reduction, and float16 quantization results in around a 48% reduction, as shown in Figure 17. Based on these findings, dynamic range quantization outperforms the other quantization techniques in terms of file size reduction, albeit only slightly surpassing full-integer quantization.
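As a quick arithmetic check, the percentages above can be turned into approximate file sizes. The sketch below uses only the figures quoted in the text (1176 KB, 397 KB, and the rounded reductions of 70%, 68%, and 48%), so the resulting sizes are estimates derived from rounded percentages, not measured values.

```python
TF_SIZE_KB = 1176      # original TensorFlow model
TFLITE_SIZE_KB = 397   # TFLite model without quantization
# Rounded reductions relative to the unquantized TFLite model, as quoted in the text.
REDUCTION_VS_TFLITE = {"dynamic_range": 0.70, "full_integer": 0.68, "float16": 0.48}


def estimated_sizes_kb(reference_kb, reductions):
    """Apply each fractional reduction to the reference size."""
    return {name: reference_kb * (1.0 - r) for name, r in reductions.items()}


sizes = estimated_sizes_kb(TFLITE_SIZE_KB, REDUCTION_VS_TFLITE)
for name, kb in sizes.items():
    share = kb / TF_SIZE_KB
    print(f"{name}: about {kb:.0f} KB, {share:.1%} of the original TF model")
print(f"TF to TFLite size ratio: {TF_SIZE_KB / TFLITE_SIZE_KB:.2f}x")
```

Under these figures, the dynamic range model lands well below a quarter of the original TF model's size.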
The execution time to forecast the test data is measured, and the results are presented in Table 12. The original TF model has the highest execution time, 323.5977 s, as no model compression is involved. Among the TFLite models with and without quantization, the TFLite model without quantization has the lowest execution time, as it incurs no quantization overhead. Among the quantized TFLite models, dynamic range quantization has a slightly higher execution time than the unquantized TFLite model because of the additional processing required for quantization. However, dynamic range quantization offers a good balance between model size reduction and execution time. Although full-integer quantization achieves a 69% reduction in file size compared to the unquantized TFLite model, it has a higher execution time than the other methods, at 68.8838 s. Float16 quantization has the shortest execution time among the quantized TFLite models, with a latency of 54.7483 s.

In addition to model size and execution time, we also evaluated the accuracy of the model after post-training quantization. Table 13 illustrates how quantization affects the average model accuracy of the six pollutants across 12 time steps for the original TF model (i.e., the Seq2Seq GRU Attention model determined from Table 9) and the TFLite models. If the focus is mainly on model accuracy, the original TF model or the TFLite model without quantization can be considered. However, neither is the most effective option for reducing model size and improving execution time on the fog nodes. Hence, TFLite models with post-training quantization are considered. The TFLite models offer a file size reduction compared to the original TF model, while showing varying levels of execution time and accuracy. The dynamic range and float16 quantization methods maintain accuracy close to that of the TFLite model without quantization, whereas full-integer quantization has lower accuracy than the other methods. Specifically, dynamic range quantization outperforms the other TFLite models in terms of model accuracy and file size reduction, but with a slightly longer execution time than float16 quantization.
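The execution times quoted above can be expressed as relative speed-ups over the original TF model. The sketch below uses only the three timings stated in the text; the helper name `percent_faster` is illustrative.

```python
TF_SECONDS = 323.5977  # original TensorFlow model, as quoted in the text
TFLITE_SECONDS = {"full_integer": 68.8838, "float16": 54.7483}


def percent_faster(baseline_s, candidate_s):
    """Relative reduction in execution time, in percent of the baseline."""
    return 100.0 * (baseline_s - candidate_s) / baseline_s


for name, seconds in TFLITE_SECONDS.items():
    print(f"{name}: {percent_faster(TF_SECONDS, seconds):.2f}% less time than the TF model")
```

The same formula applies to any other row of Table 12.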
On the other hand, float16 quantization strikes a balance between accuracy and execution time. While it may offer slightly lower accuracy than dynamic range quantization, it compensates with the shortest execution time among the quantized models. However, float16 quantization achieves only a moderate reduction in file size. Full-integer quantization offers a good file size reduction, but this comes at the expense of model accuracy and longer execution times, which affect the real-time responsiveness of the SFEG. Based on the experimental results, applying dynamic range quantization to the Seq2Seq GRU Attention model results in an optimized lightweight model that maintains good accuracy, offers a significant reduction in file size, and keeps execution time low, striking a better balance between performance and computational efficiency than the other post-training quantization methods. The deployment of this optimized lightweight model on the SFEG will enable efficient Fog Intelligence and enhance the effectiveness of the FAQMP system through accurate multi-step air quality forecasting, timely early warnings through EnviroWeb, event response with reduced latency, and optimized fog resource utilization.

Conclusions and Future Works
Compared to the traditional air quality monitoring systems that process data in the centralized cloud, this study proposes a novel Fog-enabled Air Quality Monitoring and Prediction (FAQMP) system, based on IoT, Fog Computing, and Deep Learning, for efficient real-time decision support in smart cities. Fog Computing brings storage, computation, and networking closer to the edge for faster processing, reduced latency, reliability, and minimized bandwidth consumption, reducing the burden on the cloud. In the three-layered architecture of the proposed FAQMP system, the sensing layer has a cost-effective AQM node designed with a customized PCB that interfaces an array of sensors to the Arduino Mega 2560 controller to acquire pollutant and meteorological parameters. These data are wirelessly transmitted to the Smart Fog Environmental Gateway (SFEG) in the fog layer using LoRa, a long-range, low-cost, and low-power LPWAN-based solution. Further, efficient Fog Intelligence is facilitated in the SFEG through an optimized lightweight DL-based Seq2Seq GRU Attention model for real-time accurate forecasting, timely early warnings, and faster event response with effective utilization of fog resources.

The Seq2Seq GRU Attention model for multivariate multi-step forecasting of air quality levels combines the strength of the Seq2Seq architecture with the attention mechanism. The experimental results reveal that Seq2Seq GRU Attention improved the forecasting accuracy and stability in comparison with the state-of-the-art DL methods, with an average RMSE of 5.5576, MAE of 3.4975, MAPE of 19.1991, R² of 0.6926, and Theil's U1 of 0.1325 across the six primary pollutants (PM2.5, PM10, CO, SO2, NO2, and O3) for the future twelve time steps. Subsequently, to facilitate optimized model deployment on resource-constrained fog nodes like the SFEG, post-training quantization techniques (dynamic range quantization, integer with float fallback quantization, full-integer-only quantization, and float16 quantization) are applied to the initial model. The evaluation shows that dynamic range quantization outperformed the other methods with a significant reduction in file size, good forecast accuracy, and improved execution time, striking a balance between performance and computational efficiency. This makes the proposed optimized lightweight model suitable for deployment on the SFEG to enable efficient Fog Intelligence and thereby enhance the effectiveness of the FAQMP system. Furthermore, the EnviroWeb application presents real-time air quality data and alerts to the stakeholders. In summary, the proposed FAQMP system enables real-time and low-cost monitoring, efficient communication, accurate multi-step forecasting with optimized fog resource utilization, and decision support through timely early warnings and event responses. This will empower informed decision making to address pollution concerns and maintain safe AQI levels in smart cities.

While intelligence in Fog Computing is still in its early stages, our study offers promising directions for further exploration. A limitation of this work is that the performance of the proposed optimized lightweight model has not been analyzed across various resource-constrained fog environments. Future work will address this by evaluating the model in different settings. Additionally, the accuracy of sensor data collected in the AQM node will be enhanced through automated calibration procedures to improve reliability and consistency in air quality measurements.

Algorithm 1: Pre-Calibration of the MQ136
1: R0 calculation (R0: sensor resistance in pure air);
2: Rg calculation (Rg: sensor resistance in the presence of a specific gas);
3: Analog read sensor pin;
4: Collect various samples and determine the aggregate (S);
5: R0 = S / clean air factor;
6: Extrapolate coefficients x and y from datasheet;
7: Estimate ppm values, ppm = x ×
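The calibration steps listed above can be sketched as follows. Several details are assumptions made for illustration only: the aggregate S is taken as the mean of the clean-air resistance samples, the sensor resistance is derived from a voltage divider with a 10 kΩ load resistor on a 10-bit, 5 V ADC, and the truncated ppm formula is assumed to be the power law ppm = x × (Rg / R0)^y that is commonly fitted to MQ-series datasheet curves. The coefficient and factor values in the usage lines are placeholders, not datasheet values.

```python
def sensor_resistance(adc_value, load_kohm=10.0, adc_max=1023, vcc=5.0):
    """Sensor resistance (kOhm) from a voltage-divider ADC reading."""
    if not 0 < adc_value <= adc_max:
        raise ValueError("adc_value must be in (0, adc_max]")
    v_out = vcc * adc_value / adc_max
    return load_kohm * (vcc - v_out) / v_out


def calibrate_r0(clean_air_adc_samples, clean_air_factor):
    """Steps 3-5: average the clean-air resistance samples (S), then R0 = S / factor."""
    resistances = [sensor_resistance(v) for v in clean_air_adc_samples]
    s = sum(resistances) / len(resistances)
    return s / clean_air_factor


def estimate_ppm(adc_value, r0, x, y):
    """Step 7 under the assumed power law: ppm = x * (Rg / R0) ** y."""
    rg = sensor_resistance(adc_value)
    return x * (rg / r0) ** y


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    r0 = calibrate_r0([341, 341, 341], clean_air_factor=4.0)
    print(r0, estimate_ppm(341, r0, x=2.0, y=-1.0))
```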

Figure 2. A three-layered Fog Computing-based architecture of the proposed system.

Figure 3. A three-layered Fog Computing-based architecture of the proposed system.

Figure 4. Architecture and data flow of the proposed Fog-enabled Air Quality Monitoring and Prediction (FAQMP) System.

• Node Authentication: The SFEG's node authentication module authenticates the AQM sensor nodes to join the SFEG network for continuous transmission of air quality data. Initially, this module sends an authentication number to the AQM node in the sensing layer via LoRa downlink. The AQM node combines the received authentication number with the collected air quality data, forming a payload for LoRa uplink to the SFEG. Daemons on the SFEG listen for incoming messages, extract the live data, including the authentication number, and verify them. If the number matches, the AQM sensor node is authenticated and can send data, ensuring secure and reliable transmission.
• Data Handler: The SFEG's data handler filters and preprocesses the received air quality data. Missing values are imputed using linear interpolation, and the AQI is calculated and appended to the preprocessed data. Preprocessing tasks like data cleaning, filtering, aggregation, and formatting enhance the data before analysis. The final prepared data, including PM2.5, PM10, NO2, SO2, CO, O3, temperature, pressure, humidity, WS, WD, and SR, along with the AQI, are stored in the SFEG database to ensure seamless data recovery. SFEG data storage allows the system to remain stable and provide backup even during network outages and intermittent connectivity.
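The linear-interpolation step of the data handler can be illustrated with the short sketch below. It assumes readings arrive at a fixed interval with missing values encoded as `None`, and that gaps at the very start or end of a series take the nearest known value; both choices, and the function name `interpolate_missing`, are assumptions for illustration rather than details of the SFEG implementation.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest known readings."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("at least one reading must be present")
    filled = list(values)
    # Assumption: leading and trailing gaps copy the nearest known reading.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the bracketing known readings.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            filled[i] = values[left] + (values[right] - values[left]) * (i - left) / span
    return filled


if __name__ == "__main__":
    # Toy PM2.5 series with two missing samples (illustrative values only).
    print(interpolate_missing([10.0, None, None, 40.0]))
```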

• Cloud Model Orchestrator: The cloud model orchestrator manages the training, optimization, and storage of the DL model in the cloud for deployment on the SFEG. At first, AWS SageMaker facilitates model training with historical air quality data by leveraging powerful compute instances. Based on the comparative analysis of various DL models, the Seq2Seq GRU Attention model has good multi-step forecasting performance, as analyzed from the results in Section 5.1.6. To make the model lightweight and efficient for the SFEG, dynamic range quantization, a PTQ-based model compression technique, is applied, reducing its size and execution time while maintaining model accuracy. This optimized lightweight Seq2Seq GRU Attention model, chosen based on the experimental results from Section 5.2, is then stored in AWS S3.
• Model Manager: By embracing fog-cloud collaboration, the model manager on the SFEG uses the model downloader to download the model from the AWS S3 bucket using the boto3 library. This model is deployed on the SFEG for inference, introducing efficient Fog Intelligence. Figure 5 illustrates the deployment pipeline [90] for the optimized DL model on a low-resource fog device like the SFEG. The local air quality data are fetched, normalized, and fed into the model to generate multi-step forecasts for the next 3 h (12 time steps) at 15 min intervals in real time, without network delays. These live and meaningful forecasts are presented through the EnviroWeb application dashboard to the stakeholders.
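A minimal sketch of the model manager's steps is given below: downloading the model file from S3 with boto3, normalizing an input row, and generating the timestamps of the 12 forecast steps at 15 min intervals. The bucket, key, and path arguments are placeholders, min-max scaling is an assumed normalization scheme (the text only states that the data are normalized), and all function names are hypothetical.

```python
from datetime import datetime, timedelta


def download_model(bucket, key, local_path):
    """Fetch the optimized model file from S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the other helpers work without boto3

    boto3.client("s3").download_file(bucket, key, local_path)
    return local_path


def min_max_normalize(row, minimums, maximums):
    """Scale each feature to [0, 1]; constant features map to 0.0 (assumed scheme)."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, minimums, maximums)
    ]


def forecast_timestamps(last_observation, steps=12, interval_minutes=15):
    """Timestamps of the forecast horizon: 12 steps of 15 min cover the next 3 h."""
    return [
        last_observation + timedelta(minutes=interval_minutes * k)
        for k in range(1, steps + 1)
    ]


if __name__ == "__main__":
    for ts in forecast_timestamps(datetime(2024, 1, 1, 9, 0)):
        print(ts.isoformat())
```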

Figure 5. DL model deployment pipeline after model quantization.

Furthermore, the error assessment component determines the cumulative forecasting error of the deployed model over time using Root Mean Square Error (RMSE). If the cumulative error exceeds a predefined threshold, the model update component requests retraining in the cloud with updated air quality samples. The model is retrained and optimized, stored in AWS S3, downloaded by the model downloader, and deployed for new forecasts. This process in the model manager establishes a feedback loop in which model retraining allows for regular updates to model parameters, ensuring adaptation to seasonal changes and long-term air quality trends and thereby maintaining accuracy over time.
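The error assessment feedback loop described above can be sketched as a small monitor that accumulates squared forecast errors and signals when the cumulative RMSE crosses a threshold. The class name `ForecastErrorMonitor`, the threshold value, and the reset-after-redeployment behaviour are illustrative assumptions; the text does not specify them.

```python
import math


class ForecastErrorMonitor:
    """Track cumulative RMSE of deployed forecasts and flag when retraining is due."""

    def __init__(self, rmse_threshold):
        self.rmse_threshold = rmse_threshold
        self._sum_sq = 0.0
        self._count = 0

    @property
    def cumulative_rmse(self):
        return math.sqrt(self._sum_sq / self._count) if self._count else 0.0

    def update(self, observed, forecast):
        """Record one observed/forecast pair; return True if retraining is needed."""
        self._sum_sq += (observed - forecast) ** 2
        self._count += 1
        return self.cumulative_rmse > self.rmse_threshold

    def reset(self):
        """Assumed to be called after a retrained model has been redeployed."""
        self._sum_sq = 0.0
        self._count = 0


if __name__ == "__main__":
    monitor = ForecastErrorMonitor(rmse_threshold=2.0)  # hypothetical threshold
    for observed, forecast in [(10.0, 11.0), (10.0, 14.0)]:
        if monitor.update(observed, forecast):
            print("request retraining; cumulative RMSE =", monitor.cumulative_rmse)
```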

Figure 6. Real-time alerts triggered by anomalous AQI levels via email.

Figure 7. Graphical User Interface of the EnviroWeb application displaying the live pollutants, Air Quality Index (AQI) level, and recommendations for citizens in real time.

• Smart Alerts: When an anomalous event or hotspot is detected based on the live and forecasted air quality levels, alerts are presented in the dashboard as a part of the EWS Handler's actuation (information streaming), as discussed previously in Section 3.2.
• Maps: The interactive map uses color-coded markers based on the AQI level of a specific location, allowing the users to navigate and explore the surrounding AQM station locations and determine AQI hotspots.
• Data Export and Sharing: The data export and sharing feature allows users to export the monitoring and forecast data.

Figure 8. City-wide implementation of the proposed FAQMP system.

Figure 10. Architecture of the Sequence-to-Sequence GRU Attention mechanism.

2. Attention Mechanism: The attention mechanism in the Seq2Seq GRU-based model allows the decoder to focus on significant parts of the input sequence while generating the output, guided by attention scores.
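How attention scores guide the decoder can be illustrated with the sketch below, which scores each encoder hidden state against the current decoder state, normalizes the scores with a softmax, and forms a weighted context vector. Dot-product scoring is used here purely as an assumption for illustration; the scoring function used in the proposed model is not stated in this passage.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Return (attention weights, context vector) using dot-product scores."""
    scores = [
        sum(d * h for d, h in zip(decoder_state, state)) for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[j] for w, state in zip(weights, encoder_states))
        for j in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Two encoder time steps with 2-dimensional hidden states (toy values).
    weights, context = attention_context([1.0, 0.0], [[0.0, 1.0], [5.0, 0.0]])
    print(weights, context)
```

Encoder time steps whose hidden states align more closely with the decoder state receive larger weights and therefore contribute more to the context vector used for the next forecast step.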

Figure 11. Steps involved in multivariate multi-step air quality forecasting.


Table 1. AQI remarks by CPCB in India.

Table 2. Summary of the existing AQM systems. √: Considered by the system, X: Not considered by the system.

Table 3. Comparison of hardware platforms for Fog Computing with their specifications.

Table 4. Sensors to monitor the pollutants and meteorological parameters.

Table 5. A summary of the tools and technologies in the FAQMP system.
Here, u_t and r_t represent the update gate and reset gate, respectively; h̃_t denotes the candidate hidden state; h_t represents the hidden state; and tanh is the hyperbolic tangent function. The sigmoid activation function σ is used for both the reset and update gates. W_r, W_u, and W_h̃ are the learned weight matrices associated with the reset gate, update gate, and candidate output, respectively. b_r, b_u, and b_h are the bias vectors, and * represents element-wise multiplication.
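For reference, a standard formulation of the GRU that is consistent with the symbols defined above is sketched below. The input vector x_t and the concatenation notation [·, ·] are assumptions, since the equations themselves are not reproduced here, and some references swap the roles of u_t and (1 - u_t) in the last line.

```latex
\begin{aligned}
u_t &= \sigma\left(W_u\,[h_{t-1}, x_t] + b_u\right) \\
r_t &= \sigma\left(W_r\,[h_{t-1}, x_t] + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_{\tilde{h}}\,[r_t * h_{t-1}, x_t] + b_h\right) \\
h_t &= (1 - u_t) * h_{t-1} + u_t * \tilde{h}_t
\end{aligned}
```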

Table 7. Evaluation of the DL-based multivariate multi-step forecasting models to forecast PM2.5 for 12 consecutive time steps. The bold-font numbers represent the data in this study.

Table 8. Evaluation of the DL-based multivariate multi-step forecasting models to forecast PM10 for 12 consecutive time steps. The bold-font numbers represent the data in this study.

Table 9. Comparative analysis of average forecasting errors (1st to 12th time steps) of various models for all the primary pollutants (PM2.5, PM10, NO2, SO2, CO, and O3).

• Seq2Seq GRU exhibited improved average forecasting performance over the RNN variants (LSTM-GRU and GRU). This indicates that introducing an encoder-decoder into the RNN model is beneficial for forecasting performance. For instance, compared with the LSTM-GRU, the RMSE of Seq2Seq GRU decreases by 3.19%, the MAE decreases by 5.79%, MAPE decreases by 7.38%, Theil's U1 decreases by 9.19%, and R² increases by 0.1553.
• The forecasting performance of the Autoencoder model (GRU Autoencoder), a variant of the encoder-decoder, is superior to that of the Seq2Seq GRU for all the pollutants. The hybrid variant of the AE (GRU-LSTM Autoencoder) in turn performs better than the GRU Autoencoder. Compared with the GRU Autoencoder, the RMSE, MAE, MAPE, and Theil's U1 of the GRU-LSTM Autoencoder decrease by 8.81%, 6.76%, 7.06%, and 7.24%, respectively, and R² increases by 0.0931.
• Moreover, adding an attention mechanism to the LSTM-GRU architecture, as seen in the LSTM-GRU Attention, led to an enhancement in forecasting performance. Compared with the LSTM-GRU, the RMSE, MAE, MAPE, and Theil's U1 of the LSTM-GRU Attention model decrease by 12.76%, 15.45%, 19.23%, and 15.14%, respectively, and R² increases by 0.2022. The encoder-decoder-based attention variants (Seq2Seq LSTM Attention, Seq2Seq Bi-LSTM Attention, and Seq2Seq GRU Attention) exhibit improved performance over the Seq2Seq GRU. This indicates that introducing an attention mechanism overcomes the limitations of the traditional Seq2Seq RNN models by dynamically focusing on relevant input sequences to capture contextual information critical for accurate predictions, mitigating information loss from fixed-length context vectors, addressing the vanishing gradient problem to capture long-range dependencies effectively, and generating context-related forecasts with enhanced performance.
• The Seq2Seq Bi-LSTM Attention exhibits an average forecasting performance similar to that of the GRU-LSTM Autoencoder; the latter demonstrates better efficacy specifically for the pollutants PM2.5, NO2, and O3. However, the proposed Seq2Seq GRU Attention model demonstrates the best average performance across twelve time steps for each of the pollutants, as well as superior average forecasting performance across all the pollutants in comparison with the baselines. Compared with the Seq2Seq Bi-LSTM Attention, the average forecasting performance of the proposed model across all the pollutants improves as follows: RMSE decreases by 18.27%, MAE decreases by 33.83%, MAPE decreases by 33.51%, Theil's U1 decreases by 18.95%, and R² increases by 28.70%.
• The proposed Seq2Seq GRU Attention achieves the best average forecasting performance across six pollutants for 12 time steps, with an average RMSE of 5.5576, MAE of 3.4975, MAPE of 19.1991, R² of 0.6926, and Theil's U1 of 0.1325, as highlighted in bold in the table.

Table 10. TensorFlow and TensorFlow Lite models file size comparison.

• TFLite converter: The TFLite converter converts TF models into an optimized format by applying optimization techniques such as quantization, model pruning, and operator fusion to reduce model size and increase inference speed. It generates a TensorFlow Lite model file (.tflite) that contains the converted model in a format that the TFLite interpreter can handle.

Table 11. File size comparison of the TF and TFLite models.

Table 12. Comparison of execution time (seconds) for TFLite models.

Table 13. Comparison of average model accuracy of the six pollutants across 12 time steps for the original TF models and TFLite models.